Abstract

Shannon's entropy power inequality (EPI) can be viewed as a statement of concavity of an
entropic function of a continuous random variable under a scaled addition rule:

$$ f(\sqrt{a}\,X + \sqrt{1-a}\,Y) \geq a f(X) + (1-a) f(Y) \qquad \forall a \in [0,1]. $$

Here, $X$ and $Y$ are continuous random variables and the function $f$ is either the differential
entropy or the entropy power. König and Smith [IEEE Trans. Inf. Theory. 60(3):1536-1548, 2014]
and De Palma, Mari, and Giovannetti [Nature Photon. 8(12):958-964, 2014] obtained quantum
analogues of these inequalities for continuous-variable quantum systems, where $X$ and $Y$ are
replaced by bosonic fields and the addition rule is the action of a beamsplitter with transmissivity
$a$ on those fields. In this paper, we similarly establish a class of EPI analogues for $d$-level quantum
systems (i.e. qudits). The underlying addition rule for which these inequalities hold is given by
a quantum channel that depends on the parameter $a \in [0,1]$ and acts like a finite-dimensional
analogue of a beamsplitter with transmissivity $a$, converting a two-qudit product state into a
single qudit state. We refer to this channel as a partial swap channel because of the particular way
its output interpolates between the states of the two qudits in the input as $a$ is changed from
zero to one. We obtain analogues of Shannon's EPI, not only for the von Neumann entropy
and the entropy power for the output of such channels, but for a much larger class of functions
as well. This class includes the Rényi entropies and the subentropy. We also prove a qudit
analogue of the entropy photon number inequality (EPnI). Finally, for the subclass of partial
swap channels for which one of the qudit states in the input is fixed, our EPIs and EPnI yield
lower bounds on the minimum output entropy and upper bounds on the Holevo capacity.


1 Introduction 

Inequalities between entropic quantities play a fundamental role in information theory and have 
been employed effectively in finding bounds on optimal rates of various information-processing 
tasks. Shannon's entropy power inequality (EPI) [Sha48] is one such inequality and it has proved to 
be of relevance in studying problems not only in information theory, but also in probability theory 
and mathematical physics [Sta59]. It has been used, for example, in finding upper bounds on the 
capacities of certain noisy channels (e.g. the Gaussian broadcast channel [Ber74]) and in proving
convergence in relative entropy for the Central Limit Theorem [Bar86]. 
Classical EPIs


For an arbitrary random variable $X$ on $\mathbb{R}^d$ with probability density function (p.d.f.) $f_X$, the entropy
power of $X$ is the quantity

$$ v(X) := \frac{e^{2H(X)/d}}{2\pi e}, \tag{1.1} $$

where $H(X)$ is the differential entropy of $X$,

$$ H(X) := -\int_{\mathbb{R}^d} f_X(x) \log f_X(x)\, dx \tag{1.2} $$


(throughout the paper we use log to represent the natural logarithm). The name "entropy power"
is derived from the following fact: if $X$ is a Gaussian random variable on $\mathbb{R}$ with zero mean and
variance $\sigma^2$ then $H(X) = \frac{1}{2} \log(2\pi e \sigma^2)$; hence $v(X)$ is equal to its variance, which is commonly
referred to as its power. Note that the entropy power of a random variable $X$ is equal to the variance
of a Gaussian random variable which has the same differential entropy as $X$. For $X$ on $\mathbb{R}^d$, we shall
henceforth omit the factor $1/(2\pi e)$ and refer to $e^{2H(X)/d}$ as its entropy power, as in [KS14].

The entropy power satisfies the following scaling property: $v(\sqrt{\alpha}\,X) = \alpha\, v(X)$. This follows from
the scaling property of p.d.f.s: if $f_{\alpha X}$ denotes the p.d.f. of a random variable $\alpha X$ on $\mathbb{R}^d$, where $\alpha > 0$,
then $f_{\alpha X}(x) = \alpha^{-d} f_X(x/\alpha)$, $x \in \mathbb{R}^d$, which in turn implies that $H(\alpha X) = H(X) + d \log \alpha$. This
shows why the factor $1/d$ in the definition of $v(X)$ has to be there for $X$ on $\mathbb{R}^d$.

Shannon's EPI [Sha48] provides a lower bound on the entropy power of a sum of two independent
random variables $X$ and $Y$ on $\mathbb{R}^d$ in terms of the sum of the entropy powers of the individual
random variables:

$$ v(X + Y) \geq v(X) + v(Y), \tag{1.3} $$

or equivalently,

$$ e^{2H(X+Y)/d} \geq e^{2H(X)/d} + e^{2H(Y)/d}. \tag{1.4} $$

Here, $H(X + Y)$ is the differential entropy of the p.d.f. of the sum $Z := X + Y$, which is given by
the convolution

$$ f_{X+Y}(x) = (f_X * f_Y)(x) := \int f_X(x') f_Y(x - x')\, dx', \qquad \forall x \in \mathbb{R}^d. \tag{1.5} $$
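
As a small illustration of eqs. (1.3) to (1.5), the following sketch checks the EPI numerically for two independent random variables that are uniform on $[0,1]$ (so $d = 1$). The choice of distributions, the helper names and the midpoint-rule integration are assumptions made for this example only. The sum has the triangular density on $[0,2]$ obtained from the convolution (1.5), whose differential entropy is $1/2$, so the factor $1/(2\pi e)$ cancels and the EPI reads $e \geq 2$.

```python
import numpy as np

def differential_entropy(density, grid):
    """Midpoint-rule estimate of -integral f log f (natural log) on a uniform grid."""
    dx = grid[1] - grid[0]
    f = density(grid)
    safe_f = np.where(f > 0, f, 1.0)          # avoids log(0); contributes 0 below
    integrand = np.where(f > 0, -f * np.log(safe_f), 0.0)
    return float(np.sum(integrand) * dx)

def uniform_pdf(x):
    """Density of a uniform random variable on [0, 1]."""
    return np.where((x >= 0) & (x <= 1), 1.0, 0.0)

def triangular_pdf(x):
    """Density of X + Y for independent uniforms on [0, 1], i.e. f_X * f_Y."""
    return np.where((x >= 0) & (x <= 2), 1.0 - np.abs(x - 1.0), 0.0)

n = 200_000
grid = (np.arange(n) + 0.5) * (2.0 / n)       # cell midpoints covering [0, 2]

h_x = differential_entropy(uniform_pdf, grid)     # entropy of X (and of Y)
h_sum = differential_entropy(triangular_pdf, grid)  # entropy of X + Y

# Entropy powers with the common factor 1/(2*pi*e) dropped, d = 1.
lhs = np.exp(2 * h_sum)
rhs = np.exp(2 * h_x) + np.exp(2 * h_x)
print(h_x, h_sum, lhs, rhs)
```

The inequality is strict here because the summands are not Gaussian; for Gaussian inputs both sides coincide.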


The inequality eq. (1.3) was proposed by Shannon in [Sha48] as a means to bound the capacity
of a non-Gaussian additive noise channel, that is, a channel with input $X$ and output $X + Y$, with
$Y$ being an independent (non-Gaussian) random variable modeling the noise which is added to
the input. Later, Lieb [Lie78] and Dembo, Cover, and Thomas [DCT91] (see also [VG06]) showed
that the EPI (1.4) can be equivalently expressed as the following inequality between differential
entropies:

$$ H(\sqrt{a}\,X + \sqrt{1-a}\,Y) \geq a H(X) + (1-a) H(Y), \qquad \forall a \in [0,1]. \tag{1.6} $$

The above inequality was proved by employing the Rényi entropy [Ren61] and using properties of
$p$-norms on convolutions given by a sharp form of Young's inequality [Bec75].

The form of the EPI in eq. (1.6) motivates the definition of an operation (which following [KS13b,
KS14] we denote as $\boxplus_a$) on the space of random variables, given by the following scaled addition
rule:

$$ X \boxplus_a Y := \sqrt{a}\,X + \sqrt{1-a}\,Y, \qquad \forall a \in [0,1]. \tag{1.7} $$

The random variable $X \boxplus_a Y$ can be interpreted as an interpolation between $X$ and $Y$ as $a$ is
decreased from 1 to 0. With this notation, the inequality (1.6) can be written as

$$ H(X \boxplus_a Y) \geq a H(X) + (1-a) H(Y), \qquad \forall a \in [0,1]. \tag{1.8} $$
Using the scaling property of the entropy power, the EPI (1.4) can be expressed as follows:

$$ e^{2H(X \boxplus_a Y)/d} \geq a\, e^{2H(X)/d} + (1-a)\, e^{2H(Y)/d}. \tag{1.9} $$

Shannon's EPI (1.4) (and hence also (1.8) and (1.9)) was first proved rigorously by Stam [Sta59] and
by Blachman [Bla65], by employing de Bruijn's identity, which couples Fisher information with
differential entropy. Since then various different proofs and generalizations of the EPI have been
proposed (see e.g. [VG06, Rio11, SS15] and references therein).

It is natural to conjecture that an analogue of Shannon's EPI also holds for discrete random
variables, e.g. on non-negative integers. This conjecture was first proved in [HV03] for the case of
binomial random variables. They proved that if $X_n \sim \mathrm{Bin}(n, p)$, then for $p = 1/2$ (see also [SDM11]):

$$ e^{2H(X_n + X_m)} \geq e^{2H(X_n)} + e^{2H(X_m)}, \qquad \forall m, n \geq 1. \tag{1.10} $$

Further, Johnson and Yu [JY10] established a form of the EPI which is valid for ultra log-concave
discrete random variables (see Definition 2.2 of [JY10]), whereby the scaling operation of a
continuous random variable was suitably replaced by the so-called thinning operation introduced by
Rényi [Ren56], which is considered to be an analogue of scaling for discrete random variables.

Quantum analogues of EPIs 

The discovery of an analogue of the EPI in the quantum setting by König and Smith [KS14] marked
a significant advance in quantum information theory. They proposed an EPI which holds for
continuous-variable quantum systems that arise, for example, in quantum optics. In this case, the
random variables $X$, $Y$ of the classical EPIs (eq. (1.8) and eq. (1.9)) are replaced by quantum fields,
bosonic modes of electromagnetic radiation, described by quantum states $\rho_X$, $\rho_Y$, which act on a
separable, infinite-dimensional Hilbert space $\mathcal{H}$. The differential entropy is accordingly replaced
by the von Neumann entropy $H(\rho) := -\operatorname{Tr}(\rho \log \rho)$.

A prerequisite for any quantum analogue of the EPI is the formulation of a suitable analogue
of the addition rule (1.7) which can be applied to pairs of quantum states. Since the
quantum-mechanical analogue of additive noise can be modelled by the mixing of two beams of light at
a beamsplitter, König and Smith considered the parameter $a$ in eq. (1.7) to be the beamsplitter's
transmissivity. The classical addition rule eq. (1.7) is thereby replaced by an analogous quantum field
addition rule for the field operators. In particular, if the two input signals are $m$-mode bosonic fields,
with annihilation operators $\hat{a}_1, \dots, \hat{a}_m$ and $\hat{b}_1, \dots, \hat{b}_m$ respectively, then the output is an $m$-mode
bosonic field with annihilation operators $\hat{c}_1, \dots, \hat{c}_m$, where

$$ \hat{c}_i := \sqrt{a}\,\hat{a}_i + \sqrt{1-a}\,\hat{b}_i. \tag{1.11} $$

In a state space description, the input signals are described by quantum states $\rho_X$, $\rho_Y$ on $\mathcal{H}$. This
yields an equivalent quantum state addition rule, where the beamsplitter converts the incoming state
$\rho_X \otimes \rho_Y$ to a state $\rho_X \boxplus_a \rho_Y$ given by

$$ (\rho_X, \rho_Y) \mapsto \rho_X \boxplus_a \rho_Y := \mathcal{E}_a(\rho_X \otimes \rho_Y). \tag{1.12} $$

Here, $\mathcal{E}_a$ is a linear, completely positive trace-preserving map defined through the relation

$$ \mathcal{E}_a(\rho_{XY}) := \operatorname{Tr}_Y(U_a \rho_{XY} U_a^\dagger), \tag{1.13} $$

with the partial trace being taken over the second system, and $U_a$ is the unitary operator describing
the action of the beamsplitter on the state space $\mathcal{H} \otimes \mathcal{H}$. Analogous to the classical case, the state
$\rho_X \boxplus_a \rho_Y$ reduces to $\rho_X$ when $a = 1$, and to $\rho_Y$ when $a = 0$.
König and Smith [KS14] proved that the following quantum analogues of the EPIs (1.8) and
(1.9) hold, under the quantum addition rule given by eq. (1.12):

$$ H(\rho_X \boxplus_a \rho_Y) \geq a H(\rho_X) + (1-a) H(\rho_Y), \tag{1.14} $$

$$ e^{H(\rho_X \boxplus_{1/2} \rho_Y)/m} \geq \tfrac{1}{2}\, e^{H(\rho_X)/m} + \tfrac{1}{2}\, e^{H(\rho_Y)/m}, \tag{1.15} $$

where $m$ is the number of bosonic modes. The inequality (1.15) corresponds to a 50:50 beamsplitter
(i.e., a beamsplitter with transmissivity $a = 1/2$). Later, De Palma et al. [DMG14] proved that an
analogous inequality also holds for any beamsplitter (i.e., for any $a \in [0,1]$) and is given by the
following:

$$ e^{H(\rho_X \boxplus_a \rho_Y)/m} \geq a\, e^{H(\rho_X)/m} + (1-a)\, e^{H(\rho_Y)/m}, \qquad \forall a \in [0,1]. \tag{1.16} $$

Note that the EPI given by eq. (1.15) seems to differ from its classical counterpart (1.9) by a factor of
2 in the exponent. However, one can argue that the dimension of the bosonic phase space is $d = 2m$
(as there are 2 quadratures per mode). These EPIs have found applications for bounding the classical
capacity of bosonic channels [KS13b, KS13a].

The above inequalities do not reduce to the classical EPIs (1.8) and (1.9) for commuting states; 
in other words, they are not quantum generalizations of Shannon's original EPI in the usual
sense, as they do not include the latter as a special case. This is because the addition rule acts at 
the field operator level and not at the state level. In fact, the dependence of the output state on the 
parameter a is much more complicated than in the classical case. 

Another inequality, related to the EPI (1.16), was conjectured by Guha et al. [GES08] and is
known as the entropy photon number inequality (EPnI). The thermal state of a bosonic mode with
annihilation operator $\hat{a}$ can be expressed as [GSE08]:

$$ \rho_T = \sum_{i=0}^{\infty} \frac{N^i}{(N+1)^{i+1}}\, |i\rangle\langle i|, \tag{1.17} $$

where $N := \operatorname{Tr}(\rho_T\, \hat{a}^\dagger \hat{a})$ is the average photon number of the state $\rho_T$. Its von Neumann entropy
can be evaluated as $H(\rho_T) = g(N)$ where $g(x) := (1 + x) \log(1 + x) - x \log x$. Inverting this, the
photon number of $\rho_T$ is then $N = g^{-1}(H(\rho_T))$. Correspondingly, the photon number of an $m$-mode
bosonic state $\rho$ is defined as $N(\rho) := g^{-1}(H(\rho)/m)$. Guha et al. [GES08] conjectured that

$$ N(\rho_X \boxplus_a \rho_Y) \geq a N(\rho_X) + (1-a) N(\rho_Y), \qquad \forall a \in [0,1], \tag{1.18} $$


where $\boxplus_a$ is again the quantum state addition rule (1.12). This conjecture is of particular significance
in quantum information theory since if it were true then it would allow one to evaluate classical
capacities of various bosonic channels, e.g. the bosonic broadcast channel [GSE07] and the wiretap
channel [GSE08]. It has thus far been proved only for Gaussian states [Guh08].

A natural question to ask is whether quantum EPIs can also be found outside the
continuous-variable setting. In this paper, we address this question by formulating an addition rule for $d$-level
systems (qudits) in the form of a quantum channel $\mathcal{E}_a$, which we call the partial swap channel, that acts
on the two input quantum states. We then prove analogues of the quantum EPIs (1.14) and (1.16)
for this addition rule. We also prove similar inequalities for a large class $\mathcal{F}$ of functions, including
the Rényi entropies of order $\alpha \in [0,1)$ and the subentropy [JRW94]. Again these are analogues and
not generalizations of the classical EPIs for discrete random variables [HV03, JY10, SDM11] to the
non-commutative setting, as the latter do not emerge as special cases for commuting states.

Furthermore, the concept of entropy photon number $N$ has a straightforward generalization to
qudit systems via its one-to-one relation with the von Neumann entropy, $H = g(N)$, even though
it loses its interpretation as an average photon number. We show that this function is in the
class $\mathcal{F}$ and as a result obtain the EPnI for our qudit addition rule.

Finally, we apply our results (EPIs and EPnI) to obtain lower bounds on the minimum output
entropy and upper bounds on the Holevo capacity for a class of single-input channels that are
formed from the channel $\mathcal{E}_a$ by fixing the second input state.

The EPIs in eqs. (1.14) to (1.16) for continuous-variable quantum systems were proved using 
methods analogous to those used in proving the classical EPIs (1.8) and (1.9), albeit with suitable 
adaptations to the quantum setting. In contrast, the proof of our EPIs relies on completely different 
tools, namely, spectral majorization and concavity of functions. 

2 Preliminaries 

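Let $\mathcal{H} \cong \mathbb{C}^d$ be a finite-dimensional Hilbert space (i.e., a complex Euclidean space), let $\mathcal{L}(\mathcal{H})$ denote
the set of linear operators acting on $\mathcal{H}$, and let $\mathcal{D}(\mathcal{H})$ be the set of density operators or states on $\mathcal{H}$:

$$ \mathcal{D}(\mathcal{H}) := \{\rho \in \mathcal{L}(\mathcal{H}) : \rho \geq 0,\ \operatorname{Tr} \rho = 1\}. \tag{2.1} $$

Moreover, let $\mathcal{U}(\mathcal{H})$ be the set of unitary operators acting on $\mathcal{H}$. We denote the identity operator
on $\mathcal{H}$ by $I$. A quantum channel (or quantum operation) is given by a linear, completely positive,
trace-preserving (CPTP) map $\mathcal{N} : \mathcal{L}(\mathcal{H}) \to \mathcal{L}(\mathcal{K})$, with $\mathcal{H}$ and $\mathcal{K}$ being the input and output
Hilbert spaces of the channel. For a state $\rho \in \mathcal{D}(\mathcal{H})$ with eigenvalues $\lambda_1, \dots, \lambda_d$, the von Neumann
entropy $H(\rho)$ is equal to the Shannon entropy of the probability distribution $\{\lambda_1, \dots, \lambda_d\}$, i.e.,
$H(\rho) := -\operatorname{Tr}(\rho \log \rho) = -\sum_{i=1}^{d} \lambda_i \log \lambda_i$, where we take the logarithms to base $e$.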
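
The eigenvalue formula for $H(\rho)$ translates directly into code. The sketch below is only illustrative: the function name and the cutoff used to implement the convention $0 \log 0 = 0$ are assumptions of this example.

```python
import numpy as np

def von_neumann_entropy(rho, cutoff=1e-12):
    """H(rho) = -sum_i lambda_i log lambda_i (natural log), with 0 log 0 := 0."""
    lam = np.linalg.eigvalsh(rho)      # eigenvalues of the Hermitian matrix rho
    lam = lam[lam > cutoff]            # drop (numerically) zero eigenvalues
    return float(-np.sum(lam * np.log(lam)))

d = 3
maximally_mixed = np.eye(d) / d        # I/d has entropy log d
pure = np.zeros((d, d))
pure[0, 0] = 1.0                       # a pure state has entropy 0

h_mixed = von_neumann_entropy(maximally_mixed)
h_pure = von_neumann_entropy(pure)
print(h_mixed, h_pure)
```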

The proof of the quantum EPIs that we propose relies on the concept of majorization (see
e.g. [Bha97]). For convenience we recall its definition below, making use of the following notation:
for any vector $u = (u_1, u_2, \dots, u_d) \in \mathbb{R}^d$ let $u_1^\downarrow \geq u_2^\downarrow \geq \dots \geq u_d^\downarrow$ denote the components of $u$
arranged in non-increasing order.

Definition 1 (Majorization). For $u, v \in \mathbb{R}^d$, we say that $u$ is majorised by $v$ and write $u \prec v$ if

$$ \sum_{i=1}^{k} u_i^\downarrow \leq \sum_{i=1}^{k} v_i^\downarrow \qquad \forall k \in \{1, \dots, d\} \tag{2.2} $$

with equality at $k = d$.

Definition 2. A function $f : \mathbb{R}^d \to \mathbb{R}$ is called Schur-concave [Bha97] if $f(u) \geq f(v)$ whenever $u \prec v$.
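
Definitions 1 and 2 can be checked mechanically on small examples. The following sketch (helper names and the numerical tolerance are assumptions of this example) tests the partial-sum condition (2.2) and illustrates Schur-concavity with the Shannon entropy, which is larger on the more "spread out" vector.

```python
import numpy as np

def is_majorized_by(u, v, tol=1e-12):
    """Return True if u is majorized by v in the sense of Definition 1."""
    u_sorted = np.sort(u)[::-1]        # components in non-increasing order
    v_sorted = np.sort(v)[::-1]
    partial_ok = np.all(np.cumsum(u_sorted) <= np.cumsum(v_sorted) + tol)
    totals_ok = abs(u_sorted.sum() - v_sorted.sum()) <= tol   # equality at k = d
    return bool(partial_ok and totals_ok)

def shannon_entropy(p):
    """Shannon entropy (natural log) of a probability vector, 0 log 0 := 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

uniform = np.array([1 / 3, 1 / 3, 1 / 3])
skewed = np.array([0.6, 0.3, 0.1])
point = np.array([1.0, 0.0, 0.0])

print(is_majorized_by(uniform, skewed), is_majorized_by(skewed, point))
print(shannon_entropy(uniform), shannon_entropy(skewed), shannon_entropy(point))
```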

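The notion of majorization can be extended to quantum states as follows. For $\rho, \sigma \in \mathcal{D}(\mathcal{H})$ we
write $\rho \prec \sigma$ if $\lambda(\rho) \prec \lambda(\sigma)$, where we use the notation $\lambda(\rho)$ to denote the vector of eigenvalues of
$\rho$, arranged in non-increasing order: $\lambda(\rho) := (\lambda_1(\rho), \lambda_2(\rho), \dots, \lambda_d(\rho))$ with

$$ \lambda_1(\rho) \geq \lambda_2(\rho) \geq \dots \geq \lambda_d(\rho). \tag{2.3} $$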

The following class of functions plays an important role in our paper. A canonical example of a 
function in this class is the von Neumann entropy of a density matrix. 

Definition 3. Let $\mathcal{F}$ denote the class of functions $f : \mathcal{D}(\mathbb{C}^d) \to \mathbb{R}$ satisfying the following properties:

1. Concavity: for any pair of states $\rho, \sigma \in \mathcal{D}(\mathbb{C}^d)$ and $\forall a \in [0,1]$,

$$ f(a\rho + (1-a)\sigma) \geq a f(\rho) + (1-a) f(\sigma). \tag{2.4} $$

2. Symmetry: $f(\rho)$ depends only on the eigenvalues of $\rho$ and is symmetric in them; that is, there exists a
symmetric (i.e. permutation-invariant) function $\varphi_f : \mathbb{R}^d \to \mathbb{R}$ such that $f(\rho) = \varphi_f(\lambda(\rho))$.

By restricting to diagonal states, it follows immediately that for every $f \in \mathcal{F}$ the corresponding
function $\varphi_f$ is concave. In turn, this means that $\varphi_f$ is also Schur-concave [Bha97, Theorem II.3.3].


3 Main results 


We formulate a finite-dimensional version of the quantum addition rule given by eq. (1.12), which
was introduced by König and Smith [KS13b, KS14] in the context of continuous-variable quantum
systems. Our operation, which we also denote by $\boxplus_a$, is parameterized by $a \in [0,1]$. It combines a
pair of $d$-dimensional quantum states $\rho$ and $\sigma$ according to the following quantum addition rule:

$$ \rho \boxplus_a \sigma := a\rho + (1-a)\sigma - \sqrt{a(1-a)}\; i[\rho, \sigma], \tag{3.1} $$

where $[\rho, \sigma] := \rho\sigma - \sigma\rho$. Note that if $[\rho, \sigma] = 0$ then $\rho \boxplus_a \sigma$ is simply a convex combination
of $\rho$ and $\sigma$. In Section 4 we prove that $\rho \boxplus_a \sigma = \mathcal{E}_a(\rho \otimes \sigma)$ for some quantum channel
$\mathcal{E}_a : \mathcal{D}(\mathbb{C}^d \otimes \mathbb{C}^d) \to \mathcal{D}(\mathbb{C}^d)$, see eqs. (4.16) and (4.17), implying that $\rho \boxplus_a \sigma$ is a valid state of a qudit.
The main motivation behind introducing the map $\boxplus_a$ is that, similar to its analogues (eq. (1.7) and
eq. (1.12)) in the continuous-variable classical and quantum settings, it results in an interpolation
between the two states which it combines, as the parameter $a$ is changed from 1 to 0.
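
The addition rule (3.1) is easy to experiment with numerically. The sketch below is an illustration under stated assumptions: the helper names `random_state` and `boxplus`, the random-state construction and the seed are choices made for this example and are not taken from the paper. It evaluates (3.1) for two random qutrit states and confirms that the result is Hermitian, has unit trace, and has no negative eigenvalues (up to rounding), and that the endpoints $a = 1$ and $a = 0$ return $\rho$ and $\sigma$.

```python
import numpy as np

def random_state(d, rng):
    """Random d x d density matrix: Hermitian, positive semidefinite, unit trace."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def boxplus(rho, sigma, a):
    """Qudit addition rule (3.1): a*rho + (1-a)*sigma - sqrt(a(1-a)) * i[rho, sigma]."""
    commutator = rho @ sigma - sigma @ rho
    return a * rho + (1 - a) * sigma - 1j * np.sqrt(a * (1 - a)) * commutator

rng = np.random.default_rng(0)
rho, sigma = random_state(3, rng), random_state(3, rng)
tau = boxplus(rho, sigma, 0.3)

min_eigenvalue = np.linalg.eigvalsh(tau).min()
print(np.trace(tau).real, min_eigenvalue)
```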

We are now ready to summarize our main results, which are given by the following two 
theorems and corollary. 

Theorem 4. For any $f \in \mathcal{F}$ (see Definition 3), density matrices $\rho, \sigma \in \mathcal{D}(\mathbb{C}^d)$, and any $a \in [0,1]$,

$$ f(\rho \boxplus_a \sigma) \geq a f(\rho) + (1-a) f(\sigma). $$

Note that from eq. (3.1) it follows that for commuting states (and hence for diagonal states
representing probability distributions) this inequality is equivalent to concavity of the function $f$. An
extension of Theorem 4 to three states is conjectured in [Ozo15].

In analogy with the entropy power of p.d.f.s defined in eq. (1.1), as well as the entropy power 
and entropy photon number of continuous-variable quantum states, we use the von Neumann 
entropy of finite-dimensional quantum systems to introduce similar quantities for qudits. 

Definition 5. For any $c \geq 0$, we define the entropy power $E_c$ and the entropy photon number $N_c$ of
$\rho \in \mathcal{D}(\mathbb{C}^d)$ as follows:

$$ E_c(\rho) := e^{cH(\rho)}, \tag{3.2} $$

$$ N_c(\rho) := g^{-1}(cH(\rho)) \quad \text{where} \quad g(x) := (x+1)\log(x+1) - x\log x. \tag{3.3} $$

The function $g(x)$ behaves logarithmically, and is bounded from above and from below as

$$ 1 + \log(x + 1/e) \leq g(x) \leq 1 + \log(x + 1/2), \tag{3.4} $$

from which it follows that

$$ \exp(y - 1) - 1/2 \leq g^{-1}(y) \leq \exp(y - 1) - 1/e. \tag{3.5} $$
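
Since $g$ has no closed-form inverse, $g^{-1}$ has to be evaluated numerically. The sketch below inverts $g$ by bisection, using the fact that $g'(x) = \log((x+1)/x) > 0$ so that $g$ is strictly increasing on $[0, \infty)$; the bisection scheme, tolerances and sample points are assumptions of this example. It then spot-checks the bounds (3.4) and (3.5).

```python
import math

def g(x):
    """g(x) = (x+1) log(x+1) - x log x, with g(0) = 0 by continuity."""
    if x == 0:
        return 0.0
    return (x + 1) * math.log(x + 1) - x * math.log(x)

def g_inverse(y, tol=1e-12):
    """Invert g on [0, inf) by bisection; g is strictly increasing there."""
    lo, hi = 0.0, 1.0
    while g(hi) < y:                   # grow the bracket until it contains y
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

xs = [0.0, 0.1, 1.0, 10.0, 100.0]
ys = [0.0, 0.5, 1.0, 2.0, 3.0]
eps = 1e-9
bounds_34 = all(1 + math.log(x + 1 / math.e) <= g(x) + eps
                and g(x) <= 1 + math.log(x + 0.5) + eps for x in xs)
bounds_35 = all(math.exp(y - 1) - 0.5 <= g_inverse(y) + eps
                and g_inverse(y) <= math.exp(y - 1) - 1 / math.e + eps for y in ys)
print(bounds_34, bounds_35)
```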


Note that the quantity $N_c(\rho)$ does not have any obvious physical interpretation for qudits. It
is simply defined in analogy to the continuous-variable quantum setting. Our motivation for
looking at this quantity is that it allows us to prove a qudit analogue of the entropy photon number
inequality (EPnI), which in the bosonic case remains an open problem.

|                              | Continuous: Classical ($d$ dimensions) | Continuous: Quantum ($m$ modes) | Discrete: Quantum ($d$ dimensions) |
| Entropy $H$                  | ✓                                      | ✓                               | ✓                                  |
| Entropy power $E_c$          | $c = 2/d$                              | $c = 1/m$                       | $0 \leq c \leq 1/(\log d)^2$       |
| Entropy photon number $N_c$  | -                                      | $c = 1/m$ (conjectured)         | $0 \leq c \leq 1/(d-1)$            |

Table 1: Summary of classical and quantum EPIs.

Here we introduced the scaling parameter $c$ to account for the possibility of having a dependence
on dimension or number of modes which is different from that arising in the continuous-variable
classical and quantum settings. Recall that the classical EPI (1.9) for continuous random variables
on $\mathbb{R}^d$ is stated in terms of $E_{2/d}$, while the quantum EPI (1.16) and the conjectured entropy photon
number inequality for $m$-mode bosonic quantum states involve $E_{1/m}$ and $N_{1/m}$, respectively (see
Table 1). Our next theorem establishes concavity of $E_c$ and $N_c$ for a wide range of values of $c$.

Theorem 6. For $\rho \in \mathcal{D}(\mathbb{C}^d)$, the following functions are concave:

• the entropy power $E_c(\rho)$ for $0 \leq c \leq 1/(\log d)^2$,

• the entropy photon number $N_c(\rho)$ for $0 \leq c \leq 1/(d-1)$.

Since $E_c(\rho)$ and $N_c(\rho)$ depend only on the eigenvalues of $\rho$ and are symmetric in them, the
above theorem ensures that $E_c$ and $N_c$ belong to the class of functions $\mathcal{F}$ given in Definition 3. From
Theorems 4 and 6, and the concavity of the von Neumann entropy, we obtain the following.

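Corollary 7. For any pair of density matrices $\rho, \sigma \in \mathcal{D}(\mathbb{C}^d)$ and any $a \in [0,1]$,

$$ H(\rho \boxplus_a \sigma) \geq a H(\rho) + (1-a) H(\sigma), \tag{3.6} $$

$$ e^{cH(\rho \boxplus_a \sigma)} \geq a\, e^{cH(\rho)} + (1-a)\, e^{cH(\sigma)} \quad \text{for } 0 \leq c \leq 1/(\log d)^2, \tag{3.7} $$

$$ N_c(\rho \boxplus_a \sigma) \geq a N_c(\rho) + (1-a) N_c(\sigma) \quad \text{for } 0 \leq c \leq 1/(d-1). \tag{3.8} $$

Henceforth, we refer to eqs. (3.6) and (3.7) as qudit EPIs and eq. (3.8) as qudit EPnI. A summary of
values of the parameter $c$ for which classical and quantum EPIs hold is given in Table 1.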
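
The qudit EPIs (3.6) and (3.7) can be spot-checked numerically. The sketch below is a sanity check on one pair of random qutrit states, not a proof; the helper names, the seed, the grid of values of $a$ and the choice $c = 1/(\log d)^2$ (the largest value allowed in (3.7)) are assumptions of this example.

```python
import numpy as np

def random_state(d, rng):
    """Random d x d density matrix (hypothetical helper for this example)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def boxplus(rho, sigma, a):
    """Qudit addition rule (3.1)."""
    commutator = rho @ sigma - sigma @ rho
    return a * rho + (1 - a) * sigma - 1j * np.sqrt(a * (1 - a)) * commutator

def entropy(rho):
    """Von Neumann entropy with natural logarithm, 0 log 0 := 0."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log(lam)))

d = 3
c = 1.0 / np.log(d) ** 2
rng = np.random.default_rng(1)
rho, sigma = random_state(d, rng), random_state(d, rng)

entropy_gaps, power_gaps = [], []
for a in np.linspace(0.0, 1.0, 11):
    h_out = entropy(boxplus(rho, sigma, a))
    # left-hand side minus right-hand side of (3.6) and (3.7)
    entropy_gaps.append(h_out - (a * entropy(rho) + (1 - a) * entropy(sigma)))
    power_gaps.append(np.exp(c * h_out)
                      - (a * np.exp(c * entropy(rho)) + (1 - a) * np.exp(c * entropy(sigma))))

print(min(entropy_gaps), min(power_gaps))
```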

In addition, Theorem 4 also holds for the Rényi entropy $H_\alpha(\rho)$ of order $\alpha$ [Ren61], for $\alpha \in [0,1)$,
and for the subentropy $Q(\rho)$ [JRW94, DDJB14], defined as follows:

$$ H_\alpha(\rho) := \frac{1}{1-\alpha} \log\bigl(\operatorname{Tr} \rho^\alpha\bigr), \tag{3.9} $$

$$ Q(\rho) := -\sum_{i=1}^{d} \Bigl( \prod_{j \neq i} \frac{\lambda_i}{\lambda_i - \lambda_j} \Bigr) \lambda_i \log \lambda_i, $$

where $\lambda_1, \dots, \lambda_d$ denote the eigenvalues of $\rho$. If some eigenvalues coincide (or are zero), $Q(\rho)$ is
defined to be the corresponding limit of the above expression, which is always well-defined and
finite. The above functions are clearly symmetric in the eigenvalues of $\rho$ and are known to be
concave. Hence, they belong to the class $\mathcal{F}$ and thus obey the inequality in Theorem 4.

Figure 1: A comparison of a beamsplitter and the partial swap operation.


4 An addition rule for qudit states 

In this section we show how we arrive at the quantum addition rule for qudits, (3.1), for which 
we prove a family of EPIs. This rule is based on a continuous version of the swap operation and it 
mimics the behavior of a beamsplitter. 


4.1 Beamsplitter 

Let $a$, $b$ denote the annihilation operators of the two bosonic input modes of a beamsplitter and $c$, $d$
denote the annihilation operators of the two output modes (see Fig. 1, left). Then the action of a
beamsplitter on the input modes is described as follows [KMN+07]:

$$ \begin{pmatrix} c \\ d \end{pmatrix} = B \begin{pmatrix} a \\ b \end{pmatrix}, \tag{4.1} $$

where $B$ is an arbitrary $2 \times 2$ unitary matrix also known as the scattering matrix. In particular, let us
choose

$$ B_a := \begin{pmatrix} \sqrt{a} & i\sqrt{1-a} \\ i\sqrt{1-a} & \sqrt{a} \end{pmatrix}, \tag{4.2} $$

where $a \in [0,1]$ is the transmissivity of the beamsplitter (note that this choice slightly differs from
the one corresponding to eq. (1.11)). As $a$ changes from 1 to 0, $B_a$ interpolates between the identity
matrix $I$ and $i\sigma_x$ (where $\sigma_x$ is the Pauli-$x$ matrix). Indeed, we can write

$$ B_a = \sqrt{a}\, I + i\sqrt{1-a}\, \sigma_x. \tag{4.3} $$

In particular, up to an unimportant phase, $B_0$ acts as the swap operation $\sigma_x$ between the two modes.
Thus, for intermediate values of $a$, we can interpret $B_a$ as an operator that partially swaps the two
modes. Following this intuition, in the next section we introduce a partial swap operation for two
qudits (see Fig. 1, right) that mimics the action of $B_a$. It is also described by a unitary matrix that is
an interpolation between the identity and a swap operation (up to a phase factor), but with a swap
that exchanges two qudits.
4.2 The partial swap operator

Let $\{|i\rangle\}_{i=1}^{d}$ denote the standard basis of $\mathcal{H} \cong \mathbb{C}^d$. Then $\{|i,j\rangle\}_{i,j=1}^{d}$ is an orthonormal basis of
$\mathcal{H} \otimes \mathcal{H}$. The qudit swap operator $S \in \mathcal{U}(\mathcal{H} \otimes \mathcal{H})$ is defined through its action on the basis vectors
$|i,j\rangle$ as follows:

$$ S|i,j\rangle := |j,i\rangle \quad \text{for all } i, j \in \{1, \dots, d\} \tag{4.4} $$

and can be expressed as

$$ S = \sum_{i,j=1}^{d} |i\rangle\langle j| \otimes |j\rangle\langle i|. \tag{4.5} $$

Clearly, $S^\dagger = S$ and $S^2 = I$. In analogy with the beamsplitter scattering matrix eq. (4.3), we define a
qudit partial swap operator as a unitary interpolation between the identity and the swap operator.

Since $S$ is Hermitian, we can view it as a Hamiltonian. The evolution for time $t \in \mathbb{R}$ under its
action is given by the following unitary operator, where we used the fact that $S^2 = I$:

$$ \exp(itS) = \sum_{n=0}^{\infty} \frac{(it)^n S^n}{n!} = I \cos t + iS \sin t. \tag{4.6} $$

In particular, $\exp(i(\pi/2)S) = iS$, so $\exp(itS) = (iS)^{2t/\pi}$. Thus, as $t$ changes from 0 to $\pi/2$, this
unitary operator interpolates between $I$ and $iS$, the swap gate up to a global phase. (We are
interested only in how this matrix acts under conjugation, so the global phase can be ignored.)
We reparametrize eq. (4.6) by $(\sqrt{a}, \sqrt{1-a}) = (\cos t, \sin t)$ and refer to the resulting unitary as the
partial swap operator.

Definition 8. For $a \in [0,1]$, the partial swap operator $U_a \in \mathcal{U}(\mathbb{C}^d \otimes \mathbb{C}^d)$ is the unitary operator

$$ U_a := \sqrt{a}\, I + i\sqrt{1-a}\, S. \tag{4.7} $$

Up to the sign of $i$, any complex linear combination of $I$ and $S$ that is unitary is of this form [Ozo15].
Note that $U_1 = I$ while $U_0 = iS$ acts as the qudit swap under conjugation: $U_0(\rho_1 \otimes \rho_2)U_0^\dagger = \rho_2 \otimes \rho_1$.
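
The following sketch builds $S$ and $U_a$ as explicit matrices and checks the properties stated above. It assumes the usual Kronecker-product ordering in which $|i,j\rangle$ has index $i \cdot d + j$ (with indices starting from zero); the function names and the test states are choices made for this example. The matrix exponential in (4.6) is evaluated from the spectral decomposition of the Hermitian matrix $S$.

```python
import numpy as np

def swap_operator(d):
    """S|i,j> = |j,i> on C^d (x) C^d, as a d^2 x d^2 permutation matrix."""
    S = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            S[j * d + i, i * d + j] = 1.0   # column |i,j> is mapped to row |j,i>
    return S

def partial_swap(d, a):
    """Partial swap operator (4.7): sqrt(a) I + i sqrt(1-a) S."""
    return np.sqrt(a) * np.eye(d * d) + 1j * np.sqrt(1 - a) * swap_operator(d)

d, a, t = 3, 0.3, 0.7
S = swap_operator(d)
U = partial_swap(d, a)

# exp(itS) via the eigendecomposition of S, compared with eq. (4.6).
eigvals, eigvecs = np.linalg.eigh(S)
exp_itS = (eigvecs * np.exp(1j * t * eigvals)) @ eigvecs.conj().T
closed_form = np.cos(t) * np.eye(d * d) + 1j * np.sin(t) * S

# U_0 = iS swaps the two tensor factors under conjugation.
rho1 = np.diag([0.5, 0.3, 0.2])
rho2 = np.array([[0.6, 0.2, 0.0], [0.2, 0.3, 0.0], [0.0, 0.0, 0.1]])
U0 = partial_swap(d, 0.0)
swapped = U0 @ np.kron(rho1, rho2) @ U0.conj().T
```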

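Example (Qubit case: $d = 2$). The matrix representation of the partial swap operator for qubits is

$$ U_a = \begin{pmatrix}
\sqrt{a} + i\sqrt{1-a} & 0 & 0 & 0 \\
0 & \sqrt{a} & i\sqrt{1-a} & 0 \\
0 & i\sqrt{1-a} & \sqrt{a} & 0 \\
0 & 0 & 0 & \sqrt{a} + i\sqrt{1-a}
\end{pmatrix}. \tag{4.8} $$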


4.3 The partial swap channel 

Consider a family of CPTP maps $\mathcal{E}_a : \mathcal{D}(\mathbb{C}^d \otimes \mathbb{C}^d) \to \mathcal{D}(\mathbb{C}^d)$ parameterized by $a \in [0,1]$ and
defined in terms of the partial swap operator $U_a$ given in eq. (4.7). For any $\rho_{12} \in \mathcal{D}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ with
$\mathcal{H}_1, \mathcal{H}_2 \cong \mathbb{C}^d$, let

$$ \mathcal{E}_a(\rho_{12}) := \operatorname{Tr}_2(U_a \rho_{12} U_a^\dagger), \tag{4.9} $$

where we trace out the second system. We are particularly interested in the case in which the
input state $\rho_{12}$ is a product state, i.e., $\rho_{12} = \rho_1 \otimes \rho_2$ for some $\rho_1, \rho_2 \in \mathcal{D}(\mathbb{C}^d)$. When $\mathcal{E}_a$ is applied
on such states, it combines the two density matrices $\rho_1$ and $\rho_2$ in a non-trivial manner, which
mimics the action of a beamsplitter [KS13b, KS14]. To wit, $\mathcal{E}_0(\rho_1 \otimes \rho_2) = \rho_2$ and $\mathcal{E}_1(\rho_1 \otimes \rho_2) = \rho_1$,
while for general $a \in [0,1]$ the output of $\mathcal{E}_a(\rho_1 \otimes \rho_2)$ continuously interpolates between $\rho_1$ and $\rho_2$.
The following lemma provides an explicit expression for the resulting state (this expression has
independently appeared also in [LMR14] in the context of quantum algorithms).
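Lemma 9. Let $\mathcal{E}_a$ denote the map defined in eq. (4.9) and $[\rho_1, \rho_2] := \rho_1\rho_2 - \rho_2\rho_1$. Then for $\rho_1, \rho_2 \in \mathcal{D}(\mathbb{C}^d)$,

$$ \mathcal{E}_a(\rho_1 \otimes \rho_2) = a\rho_1 + (1-a)\rho_2 - \sqrt{a(1-a)}\; i[\rho_1, \rho_2]. \tag{4.10} $$

Remark. When $(\sqrt{a}, \sqrt{1-a}) = (\cos t, \sin t)$ for some $t \in [0, \pi/2]$, this is an elliptic path in $\mathcal{D}(\mathbb{C}^d)$:

$$ \mathcal{E}_a(\rho_1 \otimes \rho_2) = \frac{\rho_1 + \rho_2}{2} + \cos 2t \cdot \frac{\rho_1 - \rho_2}{2} - \sin 2t \cdot \frac{i}{2}[\rho_1, \rho_2]. \tag{4.11} $$

If we flip the sign of $i$ or allow $t \in [-\pi/2, 0]$, we get the other half of the ellipse (see also [Ozo15]).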

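Proof of Lemma 9. Using eq. (4.7) we get

$$ U_a(\rho_1 \otimes \rho_2)U_a^\dagger = (\sqrt{a}\, I + i\sqrt{1-a}\, S)(\rho_1 \otimes \rho_2)(\sqrt{a}\, I - i\sqrt{1-a}\, S) $$
$$ = a\, \rho_1 \otimes \rho_2 + (1-a)\, \rho_2 \otimes \rho_1 + i\sqrt{a(1-a)}\, \bigl(S(\rho_1 \otimes \rho_2) - (\rho_1 \otimes \rho_2)S\bigr). \tag{4.12} $$

After tracing out the second system, the first two terms of the above expression give the first two
terms of eq. (4.10). To get the last term of eq. (4.10), note that

$$ \operatorname{Tr}_2\bigl((\rho_1 \otimes \rho_2)S\bigr) = \sum_{k=1}^{d} (I \otimes \langle k|)(\rho_1 \otimes \rho_2)\Bigl(\sum_{i,j=1}^{d} |i\rangle\langle j| \otimes |j\rangle\langle i|\Bigr)(I \otimes |k\rangle) $$
$$ = \sum_{i,j,k=1}^{d} \rho_1 |i\rangle\langle j| \,\langle k|\rho_2|j\rangle \langle i|k\rangle $$
$$ = \rho_1 \sum_{i,j=1}^{d} \langle i|\rho_2|j\rangle\, |i\rangle\langle j| = \rho_1\rho_2 \tag{4.13} $$

and similarly $\operatorname{Tr}_2\bigl(S(\rho_1 \otimes \rho_2)\bigr) = \rho_2\rho_1$. Hence,

$$ \operatorname{Tr}_2\bigl(S(\rho_1 \otimes \rho_2) - (\rho_1 \otimes \rho_2)S\bigr) = \rho_2\rho_1 - \rho_1\rho_2 = [\rho_2, \rho_1], \tag{4.14} $$

which yields the last term of eq. (4.10). □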
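
Lemma 9 can also be confirmed numerically by computing $\operatorname{Tr}_2(U_a(\rho_1 \otimes \rho_2)U_a^\dagger)$ directly and comparing it with the closed form (4.10). In the sketch below the helper names, the seed and the value of $a$ are assumptions of this example; the partial trace over the second factor is taken by reshaping the $d^2 \times d^2$ matrix into a four-index array.

```python
import numpy as np

def random_state(d, rng):
    """Random d x d density matrix (hypothetical helper for this example)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def swap_operator(d):
    """S|i,j> = |j,i> with |i,j> stored at index i*d + j."""
    S = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            S[j * d + i, i * d + j] = 1.0
    return S

def partial_trace_second(M, d):
    """Trace out the second tensor factor of a d^2 x d^2 matrix."""
    return np.einsum('ikjk->ij', M.reshape(d, d, d, d))

d, a = 3, 0.35
rng = np.random.default_rng(2)
rho1, rho2 = random_state(d, rng), random_state(d, rng)

U = np.sqrt(a) * np.eye(d * d) + 1j * np.sqrt(1 - a) * swap_operator(d)
channel_output = partial_trace_second(U @ np.kron(rho1, rho2) @ U.conj().T, d)

commutator = rho1 @ rho2 - rho2 @ rho1
closed_form = a * rho1 + (1 - a) * rho2 - 1j * np.sqrt(a * (1 - a)) * commutator
print(np.max(np.abs(channel_output - closed_form)))
```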

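One can check that the action of the channel $\mathcal{E}_a$ on an arbitrary state $\rho \in \mathcal{D}(\mathbb{C}^d \otimes \mathbb{C}^d)$ (i.e., not
necessarily a product state) can be expressed as $\mathcal{E}_a(\rho) = \sum_{k=1}^{d} A_k \rho A_k^\dagger$ with the Kraus operators $A_k$
given by

$$ A_k := \sqrt{a}\; I \otimes \langle k| + i\sqrt{1-a}\; \langle k| \otimes I \quad \text{for } k \in \{1, \dots, d\}. \tag{4.15} $$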

Using Lemma 9, we introduce a qudit addition rule which combines two d x d density matrices. 

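Definition 10 (Qudit addition rule). For any $a \in [0,1]$ and any $\rho_1, \rho_2 \in \mathcal{D}(\mathbb{C}^d)$, we define

$$ \rho_1 \boxplus_a \rho_2 := \mathcal{E}_a(\rho_1 \otimes \rho_2) = \operatorname{Tr}_2\bigl(U_a(\rho_1 \otimes \rho_2)U_a^\dagger\bigr) \tag{4.16} $$
$$ = a\rho_1 + (1-a)\rho_2 - \sqrt{a(1-a)}\; i[\rho_1, \rho_2]. \tag{4.17} $$

This operation is bilinear under convex combinations and obeys $\rho_1 \boxplus_0 \rho_2 = \rho_2$ and $\rho_1 \boxplus_1 \rho_2 = \rho_1$. A
generalization of eq. (4.17) to three states is given in [Ozo15].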

Example (Qubit case: $d = 2$). Let $\vec{r}$, $\vec{r}_1$, $\vec{r}_2$ denote the Bloch vectors (see Appendix A.1) of states $\rho_1 \boxplus_a \rho_2$,
$\rho_1$, $\rho_2$, respectively. Using the properties of Pauli matrices, one can show that eq. (4.17) is equivalent to

$$ \vec{r} = a\vec{r}_1 + (1-a)\vec{r}_2 + \sqrt{a(1-a)}\; \vec{r}_1 \times \vec{r}_2, \tag{4.18} $$

where $\vec{r}_1 \times \vec{r}_2$ denotes the cross product of $\vec{r}_1$ and $\vec{r}_2$.
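
The Bloch-vector form (4.18) can be verified numerically. The sketch below assumes the standard convention $\rho = (I + \vec{r} \cdot \vec{\sigma})/2$ for the Bloch vector of a qubit state (the paper's Appendix A.1 is not reproduced here, so this convention is an assumption), and the particular vectors and the value of $a$ are arbitrary choices for the example.

```python
import numpy as np

PAULIS = [
    np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex),   # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),     # sigma_z
]

def state_from_bloch(r):
    """Qubit state (I + r . sigma) / 2 for a Bloch vector r with |r| <= 1."""
    return 0.5 * (np.eye(2) + sum(rk * sk for rk, sk in zip(r, PAULIS)))

def bloch_vector(rho):
    """Bloch vector with components r_k = Tr(rho sigma_k)."""
    return np.array([np.trace(rho @ sk).real for sk in PAULIS])

def boxplus(rho, sigma, a):
    """Qudit addition rule (4.17)."""
    commutator = rho @ sigma - sigma @ rho
    return a * rho + (1 - a) * sigma - 1j * np.sqrt(a * (1 - a)) * commutator

a = 0.4
r1 = np.array([0.3, -0.5, 0.2])
r2 = np.array([-0.1, 0.4, 0.6])

r_out = bloch_vector(boxplus(state_from_bloch(r1), state_from_bloch(r2), a))
r_expected = a * r1 + (1 - a) * r2 + np.sqrt(a * (1 - a)) * np.cross(r1, r2)
print(r_out, r_expected)
```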
4.4 Partial swap vs. mixing

Are there any other natural operations $\boxplus'_a$ for combining two states for which the EPIs that we
prove also hold? A trivial example is the CPTP map $\mathcal{E}'_a$ that acts on product states by mixing the
two factors, i.e. for which

$$ \mathcal{E}'_a(\rho \otimes \sigma) := \rho \boxplus'_a \sigma = a\rho + (1-a)\sigma. \tag{4.19} $$

It has the following $2d$ Kraus operators: $\sqrt{a}\; I \otimes \langle k|$ and $\sqrt{1-a}\; \langle k| \otimes I$ for $k \in \{1, \dots, d\}$,
and requires an ancillary qubit. Note, however, that for this choice of $\mathcal{E}'_a$ (and hence $\boxplus'_a$) Theorem 4
is trivial as it simply restates the concavity of the function $f$.

In contrast, the partial swap channel $\mathcal{E}_a$ has the following features: (i) it yields non-trivial EPIs
(that are not simply a statement of concavity), and (ii) it does not require an ancillary qubit, so it
has only $d$ Kraus operators, the minimal number required for tracing out a $d$-dimensional system.

5 Proof of Theorem 4 

In this section we prove Theorem 4, our main result. Due to the very different setup as compared to 
the work of König and Smith, with our addition rule acting at the level of states rather than at the
level of field operators, our mathematical treatment is entirely different from theirs and bears no 
obvious similarity with the classical case either. Instead of proceeding via quantum generalizations 
of Young's inequality, Fisher information and de Bruijn's identity, the main ingredient in our proof 
is the following majorization relation relating the spectrum of the output state to the spectra of the 
input states. 

Theorem 11. For any pair of density matrices $\rho, \sigma \in \mathcal{D}(\mathbb{C}^d)$ and any $a \in [0,1]$,

$$ \lambda(\rho \boxplus_a \sigma) \prec a\lambda(\rho) + (1-a)\lambda(\sigma). \tag{5.1} $$
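
The majorization relation (5.1) lends itself to a quick numerical sanity check. The sketch below tests it on one pair of random qutrit states over a grid of values of $a$; the helper names, the seed and the tolerance are assumptions of this example, and a passing check is of course no substitute for the proof.

```python
import numpy as np

def random_state(d, rng):
    """Random d x d density matrix (hypothetical helper for this example)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def boxplus(rho, sigma, a):
    """Qudit addition rule (3.1)."""
    commutator = rho @ sigma - sigma @ rho
    return a * rho + (1 - a) * sigma - 1j * np.sqrt(a * (1 - a)) * commutator

def spectrum(rho):
    """Eigenvalues of rho in non-increasing order."""
    return np.sort(np.linalg.eigvalsh(rho))[::-1]

def is_majorized_by(u, v, tol=1e-10):
    """Partial-sum test of Definition 1 for vectors already sorted non-increasingly."""
    return bool(np.all(np.cumsum(u) <= np.cumsum(v) + tol)
                and abs(u.sum() - v.sum()) <= tol)

rng = np.random.default_rng(3)
rho, sigma = random_state(3, rng), random_state(3, rng)

results = []
for a in np.linspace(0.0, 1.0, 21):
    lhs = spectrum(boxplus(rho, sigma, a))
    rhs = a * spectrum(rho) + (1 - a) * spectrum(sigma)   # still sorted
    results.append(is_majorized_by(lhs, rhs))
print(all(results))
```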

Remark. For fields corresponding to the action of a beamsplitter, the addition rule translates to
linearly combining the covariance matrices $\gamma$ [KS14]:

$$ \gamma(\rho \boxplus_a \sigma) = a\gamma(\rho) + (1-a)\gamma(\sigma). \tag{5.2} $$

When the incoming quantum fields are both Gaussian, an inequality closely related to eq. (5.1)
holds. Denoting by $\nu(A)$ the symplectic eigenvalues of a covariance matrix $A$, Hiroshima [Hir06]
has shown that for any $A, B \geq 0$,

$$ \nu(A + B) \prec^w \nu(A) + \nu(B), \tag{5.3} $$

where $\prec^w$ stands for weak supermajorization [Bha97]. Applied to $\gamma(\rho)$ and $\gamma(\sigma)$, this inequality can
be used to derive an EPI for Gaussian fields in a similar way as we have done for qudits.

We will first show how our main result follows from Theorem 11, as this is straightforward, 
and then proceed with the proof of the latter, which is the bulk of the work. We restate Theorem 4 
here, for convenience. 

Theorem 4. For any $f \in \mathcal{F}$ (see Definition 3), density matrices $\rho, \sigma \in \mathcal{D}(\mathbb{C}^d)$, and any $a \in [0,1]$,

$$ f(\rho \boxplus_a \sigma) \geq a f(\rho) + (1-a) f(\sigma). $$
Proof. Assume Theorem 11 has been established. Let $\tilde{\rho}, \tilde{\sigma} \in \mathcal{D}(\mathbb{C}^d)$ be diagonal states whose entries
are the eigenvalues of $\rho$ and $\sigma$ (respectively), arranged in non-increasing order. Since $\lambda(\tilde{\rho}) = \lambda(\rho)$
and $\lambda(\tilde{\sigma}) = \lambda(\sigma)$, eq. (5.1) can be equivalently written as

$$ \lambda(\rho \boxplus_a \sigma) \prec a\lambda(\tilde{\rho}) + (1-a)\lambda(\tilde{\sigma}) = \lambda\bigl(a\tilde{\rho} + (1-a)\tilde{\sigma}\bigr). \tag{5.4} $$

For any function $f \in \mathcal{F}$ (see Definition 3) eq. (5.4) implies that

$$ f(\rho \boxplus_a \sigma) \geq f\bigl(a\tilde{\rho} + (1-a)\tilde{\sigma}\bigr) $$
$$ \geq a f(\tilde{\rho}) + (1-a) f(\tilde{\sigma}) $$
$$ = a f(\rho) + (1-a) f(\sigma), \tag{5.5} $$

where the first inequality follows by Schur-concavity, the second inequality follows from concavity,
and the last line follows by symmetry. Thus, we have arrived at the statement of Theorem 4. □

It remains to prove Theorem 11. For this we will need the following two lemmas. 

Lemma 12 (von Neumann [vN50, p. 55]). Let $\mathcal{L}$ and $\mathcal{M}$ be two subspaces of a vector space and let $P(\mathcal{L})$
and $P(\mathcal{M})$ denote the corresponding projectors. Then

$$ P(\mathcal{L} \cap \mathcal{M}) = \lim_{n \to \infty} \bigl(P(\mathcal{L})P(\mathcal{M})\bigr)^n. \tag{5.6} $$

Lemma 13. For $0 \leq x, y \leq 1$, the following inequality holds:

$$ xy + \sqrt{x(1-x)y(1-y)} \geq \min\{x, y\}. \tag{5.7} $$

Proof. Without loss of generality, we can assume that $0 \leq x \leq y \leq 1$, so we need to show that

$$ x \leq xy + \sqrt{x(1-x)y(1-y)}. \tag{5.8} $$

Since $x \leq y$, we have $x - xy \leq y - yx$, or $x(1-y) \leq y(1-x)$. By the above assumption, each side
is non-negative. Taking the geometric mean of each side with $x(1-y)$ then yields

$$ x(1-y) \leq \sqrt{x(1-y)\,y(1-x)}, \tag{5.9} $$

which is equivalent to what we had to prove. □
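
As a quick sanity check of Lemma 13, the inequality (5.7) can be evaluated on a grid over the unit square. The grid resolution and tolerance below are arbitrary choices for this example; equality is attained on the diagonal $x = y$, where the left-hand side reduces to $x^2 + x(1-x) = x$.

```python
import numpy as np

points = np.linspace(0.0, 1.0, 201)
X, Y = np.meshgrid(points, points)

lhs = X * Y + np.sqrt(X * (1 - X) * Y * (1 - Y))
gap = lhs - np.minimum(X, Y)        # should be non-negative everywhere

diagonal_gap = np.abs(np.diag(gap)) # equality case x = y
print(gap.min(), diagonal_gap.max())
```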

Now we are ready to prove Theorem 11. (Note that subsequently our proof has been simplified 
by Carlen, Lieb, and Loss [CLL16].) 

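Proof of Theorem 11. The expression $\rho \boxplus_a \sigma = a\rho + (1-a)\sigma - \sqrt{a(1-a)}\; i[\rho, \sigma]$ can be written as
follows:

$$ \rho \boxplus_a \sigma = a(\rho - \rho^2) + (1-a)(\sigma - \sigma^2) + (\sqrt{a}\,\rho + i\sqrt{1-a}\,\sigma)(\sqrt{a}\,\rho + i\sqrt{1-a}\,\sigma)^\dagger. \tag{5.10} $$

It is convenient to express the state $\rho \boxplus_a \sigma$ as $TT^\dagger$ for some $1 \times 3$ block-matrix

$$ T = (T_1 \ \ T_2 \ \ T_3). \tag{5.11} $$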
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We choose T := A iB where A and B are the following 1x3 block matrices: 


(5.12) 

(5.13) 


A := ^/a (^{p-0 |o), 

B:=-\/l —0 {cr — cr^. 

Here the operator square roots are well-defined, since X > for any matrix I > X > 0. Also, note 
that all blocks of A and B (and hence of T) are Hermitian. One can easily check that 

AA'^ = a(p — p^) + ap^ = ap, (5.14) 

BB^ = (1 — a)(o- — + (1 — a)(7^ = (1 — a)a, (5.15) 

TT+ = (A + /B)(A+-iB+) 

= AA+ + BB+ - i{AB^ - BA+) 

= flp + (1 — a)a — i\J a{l — a)[p,(7] 

= p fflfl cr. (5.16) 

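Given these expressions, we can rewrite eq. (5.1) as

\[ \lambda(TT^\dagger) \prec \lambda(AA^\dagger) + \lambda(BB^\dagger). \tag{5.17} \]

If $A$ and $B$ had been positive semidefinite, this inequality would have followed straight away from Theorem 3.29 in [Zha02]. Nevertheless, we can adapt the proof of this theorem to our needs. Note that

\[ \mathrm{Tr}(TT^\dagger) = \mathrm{Tr}(AA^\dagger) + \mathrm{Tr}(BB^\dagger) - i\,\mathrm{Tr}[A, B] = \mathrm{Tr}(AA^\dagger) + \mathrm{Tr}(BB^\dagger), \tag{5.18} \]

since $\mathrm{Tr}[A, B] = 0$ by the cyclicity of the trace. Hence,

\[ \sum_{j=1}^{d} \lambda_j(TT^\dagger) = \sum_{j=1}^{d} \lambda_j(AA^\dagger) + \sum_{j=1}^{d} \lambda_j(BB^\dagger). \tag{5.19} \]

From this and Definition 1 we see that eq. (5.17) is equivalent to

\[ \sum_{j=d-k+1}^{d} \lambda_j(TT^\dagger) \ge \sum_{j=d-k+1}^{d} \lambda_j(AA^\dagger) + \sum_{j=d-k+1}^{d} \lambda_j(BB^\dagger), \quad \forall k \in \{1, \dots, d\}. \tag{5.20} \]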

The left-hand side of the above inequality can be expressed variationally as follows (see e.g. Corollary 4.3.39 in [HJ12]):

\[ \sum_{j=d-k+1}^{d} \lambda_j(TT^\dagger) = \min\bigl\{ \mathrm{Tr}(U_k^\dagger TT^\dagger U_k) : U_k \in M_{d,k},\ U_k^\dagger U_k = I_k \bigr\}, \tag{5.21} \]

where $M_{d,k}$ denotes the set of $d \times k$ matrices, and $I_k \in M_k$ is the identity matrix. Note that the constraint $U_k^\dagger U_k = I_k$ is equivalent to $U_k$ being a $d \times k$ matrix consisting of $k$ columns of a $d \times d$ unitary matrix $U$. We can express $U_k$ as $U_k = U I_{k,d}$ where $I_{k,d} := I_k \oplus 0_{d-k}$, with $0_{d-k} \in M_{d-k}$ being a matrix with all entries equal to zero. Hence, $U_k U_k^\dagger = U I_{k,d} U^\dagger$, which is a projector of rank $k$.
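Clearly, $U_k U_k^\dagger \le I_d$, so that

\[
\begin{aligned}
\mathrm{Tr}(U_k^\dagger TT^\dagger U_k) &= \sum_{l=1}^{3} \mathrm{Tr}(U_k^\dagger T_l T_l^\dagger U_k) \\
&\ge \sum_{l=1}^{3} \mathrm{Tr}(U_k^\dagger T_l U_k\, U_k^\dagger T_l^\dagger U_k) \\
&= \sum_{l=1}^{3} \mathrm{Tr}\bigl[U_k^\dagger (A_l + iB_l) U_k\, U_k^\dagger (A_l - iB_l) U_k\bigr] \\
&= \sum_{l=1}^{3} \mathrm{Tr}(U_k^\dagger A_l U_k)^2 + \sum_{l=1}^{3} \mathrm{Tr}(U_k^\dagger B_l U_k)^2 - i \sum_{l=1}^{3} \mathrm{Tr}\bigl[U_k^\dagger A_l U_k,\, U_k^\dagger B_l U_k\bigr] \\
&= \sum_{l=1}^{3} \mathrm{Tr}(U_k^\dagger A_l U_k)^2 + \sum_{l=1}^{3} \mathrm{Tr}(U_k^\dagger B_l U_k)^2,
\end{aligned}
\tag{5.22}
\]

where we used that $A_l^\dagger = A_l$ and $B_l^\dagger = B_l$ for all $l$.

To complete the proof of eq. (5.20), we will show that $\sum_{l=1}^{3} \mathrm{Tr}(U_k^\dagger A_l U_k)^2 \ge \sum_{j=d-k+1}^{d} \lambda_j(AA^\dagger)$, with a corresponding inequality for $B$ following in the same way. From the definition of $A$ we have

\[ \sum_{l=1}^{3} \mathrm{Tr}(U_k^\dagger A_l U_k)^2 = a \Bigl( \mathrm{Tr}\bigl(U_k^\dagger (\rho - \rho^2)^{1/2} U_k\bigr)^2 + \mathrm{Tr}\bigl(U_k^\dagger \rho\, U_k\bigr)^2 \Bigr). \tag{5.23} \]

Recall from eq. (5.14) that $AA^\dagger = a\rho$. Therefore, we have to show that

\[ \mathrm{Tr}\bigl(U_k^\dagger (\rho - \rho^2)^{1/2} U_k\bigr)^2 + \mathrm{Tr}\bigl(U_k^\dagger \rho\, U_k\bigr)^2 \ge \sum_{j=d-k+1}^{d} \lambda_j(\rho). \tag{5.24} \]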


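Let $\rho = \sum_{i=1}^{d} \lambda_i |\psi_i\rangle\langle\psi_i|$ be the eigenvalue decomposition of $\rho$, with the eigenvalues $\lambda_i$ being arranged in non-increasing order:

\[ \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d. \tag{5.25} \]

Then the right-hand side of eq. (5.24) is $\sum_{j=d-k+1}^{d} \lambda_j$ while the left-hand side is

\[ \mathrm{Tr}\Bigl(\sum_{i=1}^{d} \sqrt{\lambda_i(1-\lambda_i)}\; U_k^\dagger |\psi_i\rangle\langle\psi_i| U_k\Bigr)^2 + \mathrm{Tr}\Bigl(\sum_{i=1}^{d} \lambda_i\, U_k^\dagger |\psi_i\rangle\langle\psi_i| U_k\Bigr)^2. \tag{5.26} \]

Expanding the squares gives

\[ \sum_{i,j=1}^{d} \Bigl(\sqrt{\lambda_i(1-\lambda_i)\lambda_j(1-\lambda_j)} + \lambda_i\lambda_j\Bigr)\, \mathrm{Tr}\bigl(U_k^\dagger |\psi_i\rangle\langle\psi_i| U_k U_k^\dagger |\psi_j\rangle\langle\psi_j| U_k\bigr). \tag{5.27} \]

Noting that

\[ C_{ij} := \mathrm{Tr}\bigl(U_k^\dagger |\psi_i\rangle\langle\psi_i| U_k U_k^\dagger |\psi_j\rangle\langle\psi_j| U_k\bigr) = \bigl|\langle\psi_i| U_k U_k^\dagger |\psi_j\rangle\bigr|^2 \tag{5.28} \]

is a non-negative real quantity, we can use Lemma 13 to show that the expression (5.27), and hence the left-hand side of eq. (5.24), is bounded below by

\[ \sum_{i,j=1}^{d} \min\{\lambda_i, \lambda_j\}\, C_{ij}. \tag{5.29} \]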
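Let $\Lambda$ be the matrix whose elements are $\Lambda_{ij} := \min\{\lambda_i, \lambda_j\}$:

\[
\Lambda = \begin{pmatrix}
\lambda_1 & \lambda_2 & \lambda_3 & \cdots & \lambda_d \\
\lambda_2 & \lambda_2 & \lambda_3 & \cdots & \lambda_d \\
\lambda_3 & \lambda_3 & \lambda_3 & \cdots & \lambda_d \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\lambda_d & \lambda_d & \lambda_d & \cdots & \lambda_d
\end{pmatrix}.
\tag{5.30}
\]

For $m \in \{1, \dots, d\}$, we define matrices $E_m$ of size $d \times d$ such that

\[
(E_m)_{ij} := \begin{cases} 1 & \text{for } 1 \le i, j \le m, \\ 0 & \text{otherwise.} \end{cases}
\tag{5.31}
\]

Then we can write

\[ \Lambda = \lambda_d E_d + \sum_{m=1}^{d-1} (\lambda_m - \lambda_{m+1}) E_m. \tag{5.32} \]

Hence,

\[
\begin{aligned}
\sum_{i,j=1}^{d} \min\{\lambda_i, \lambda_j\}\, C_{ij} &= \sum_{i,j=1}^{d} (\Lambda \circ C)_{ij} \\
&= \lambda_d \sum_{i,j=1}^{d} (E_d \circ C)_{ij} + \sum_{m=1}^{d-1} (\lambda_m - \lambda_{m+1}) \sum_{i,j=1}^{d} (E_m \circ C)_{ij},
\end{aligned}
\tag{5.33}
\]

where we use the notation $(A \circ B)_{ij} := A_{ij} B_{ij}$ for $d \times d$ matrices $A$ and $B$.

If we define $\pi(m) := \sum_{i,j=1}^{d} (E_m \circ C)_{ij} = \sum_{i,j=1}^{m} C_{ij}$, we can write eq. (5.33) as

\[ \sum_{i,j=1}^{d} \min\{\lambda_i, \lambda_j\}\, C_{ij} = \lambda_d\, \pi(d) + \sum_{m=1}^{d-1} (\lambda_m - \lambda_{m+1})\, \pi(m). \tag{5.34} \]

Recall from eq. (5.25) that the eigenvalues $\lambda_i$ are arranged in non-increasing order, so all coefficients $\lambda_d$ and $\lambda_m - \lambda_{m+1}$ are non-negative. It therefore only remains to find a lower bound on $\pi(m)$.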

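Recall from eq. (5.28) that

\[
\begin{aligned}
\sum_{i,j=1}^{m} C_{ij} &= \sum_{i,j=1}^{m} \mathrm{Tr}\bigl(U_k^\dagger |\psi_i\rangle\langle\psi_i| U_k U_k^\dagger |\psi_j\rangle\langle\psi_j| U_k\bigr) \\
&= \mathrm{Tr}\bigl(U_k^\dagger Q_m U_k U_k^\dagger Q_m U_k\bigr) \\
&= \mathrm{Tr}(P_k Q_m)^2,
\end{aligned}
\tag{5.35}
\]

where $P_k := U_k U_k^\dagger$ and $Q_m := \sum_{i=1}^{m} |\psi_i\rangle\langle\psi_i|$ are rank-$k$ and rank-$m$ projectors, respectively. Note that $\mathrm{Tr}(P_k Q_m)^n$ is monotonically decreasing as a function of $n \in \mathbb{N}$, so

\[ \mathrm{Tr}(P_k Q_m)^2 \ge \lim_{n \to \infty} \mathrm{Tr}(P_k Q_m)^n = \mathrm{Tr} \lim_{n \to \infty} (P_k Q_m)^n = \mathrm{Tr}\, R, \tag{5.36} \]

where $R := \lim_{n \to \infty} (P_k Q_m)^n$. If $S_k$ and $S_m$ are the subspaces of $\mathbb{C}^d$ corresponding to projectors $P_k$ and $Q_m$ respectively, then, by Lemma 12, $R$ is the projector onto $S_k \cap S_m$. Since $\dim S_k = k$ and $\dim S_m = m$, we get

\[ \mathrm{Tr}\, R = \dim(S_k \cap S_m) \ge \max\{0, k + m - d\}. \tag{5.37} \]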
Putting everything together, we obtain

\[ \pi(m) = \sum_{i,j=1}^{m} C_{ij} \ge \max\{0, k + m - d\}. \tag{5.38} \]

When we substitute this in eq. (5.34), we get

\[ \sum_{i,j=1}^{d} \min\{\lambda_i, \lambda_j\}\, C_{ij} \ge \lambda_d\, k + \sum_{m=d-k+1}^{d-1} (\lambda_m - \lambda_{m+1})(k + m - d). \tag{5.39} \]

The right-hand side of the above inequality is simply equal to

\[ (\lambda_{d-k+1} - \lambda_{d-k+2}) + 2(\lambda_{d-k+2} - \lambda_{d-k+3}) + \cdots + (k-1)(\lambda_{d-1} - \lambda_d) + k\lambda_d = \sum_{j=d-k+1}^{d} \lambda_j, \tag{5.40} \]

which proves eq. (5.24) and therefore the theorem. □
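The majorization relation of Theorem 11 and its consequence for the von Neumann entropy (Theorem 4) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it draws random density matrices from a Gaussian (Wishart-type) ensemble, which is a choice made here and not by the paper, and the helper names (`random_state`, `partial_swap`, `is_majorized_by`, `check_theorem_11`) are hypothetical.

```python
import numpy as np


def random_state(d, rng):
    """Random full-rank density matrix G G^dagger / Tr(G G^dagger) (assumed ensemble)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real


def partial_swap(rho, sigma, a):
    """rho [+]_a sigma = a rho + (1-a) sigma - sqrt(a(1-a)) i [rho, sigma]."""
    comm = rho @ sigma - sigma @ rho
    return a * rho + (1 - a) * sigma - 1j * np.sqrt(a * (1 - a)) * comm


def spectrum(m):
    """Eigenvalues of a Hermitian matrix in non-increasing order."""
    return np.sort(np.linalg.eigvalsh(m))[::-1]


def is_majorized_by(x, y, tol=1e-9):
    """True if x is majorized by y; both vectors sorted non-increasingly."""
    cx, cy = np.cumsum(x), np.cumsum(y)
    return bool(np.all(cx <= cy + tol) and abs(cx[-1] - cy[-1]) <= tol)


def entropy(m):
    """Von Neumann entropy in nats."""
    lam = np.linalg.eigvalsh(m)
    lam = lam[lam > 1e-15]
    return float(-np.sum(lam * np.log(lam)))


def check_theorem_11(d=4, trials=200, seed=1):
    """Test eq. (5.1) and the entropy inequality on random inputs."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        rho, sigma = random_state(d, rng), random_state(d, rng)
        a = rng.uniform()
        out = partial_swap(rho, sigma, a)
        bound = a * spectrum(rho) + (1 - a) * spectrum(sigma)
        if not is_majorized_by(spectrum(out), bound):
            return False
        if entropy(out) < a * entropy(rho) + (1 - a) * entropy(sigma) - 1e-9:
            return False
    return True


if __name__ == "__main__":
    print(check_theorem_11())
```

A passing run does not prove anything, but a single failing sample would indicate a transcription error in the formulas above.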


6 Concavity of entropy power and entropy photon number 


In this section we prove Theorem 6, which establishes concavity of the entropy power $E_c$ and the entropy photon number $N_c$ for qudits (see Definition 5). Note that both $E_c(\rho)$ and $N_c(\rho)$ are twice-differentiable and monotonically increasing functions of the von Neumann entropy $H(\rho)$. Hence, our strategy for establishing Theorem 6 is to solve the following more general problem.

Problem. Let $h : \mathbb{R} \to \mathbb{R}$ be any twice-differentiable and monotonically increasing function. For which values of $c \ge 0$ is $f_c(\rho) := h(cH(\rho))$ concave on the set of $d$-dimensional quantum states?

Since $H(\rho)$ is already concave, the function $f_c(\rho) = h(cH(\rho))$ is guaranteed to be concave for any $c \ge 0$ whenever $h$ is monotonically increasing and concave. However, there are many more functions $h$ which are not necessarily concave (in fact, they could even be convex) yet produce a concave function $f_c$ for a limited range of constants $c$. Our goal is to obtain a condition on pairs $(h, c)$ under which the function $f_c$ is concave.

To prove the concavity of $f_c$ on $\mathcal{D}(\mathbb{C}^d)$, we fix any two states $\rho, \sigma \in \mathcal{D}(\mathbb{C}^d)$ (we assume without loss of generality that $\rho$ and $\sigma$ have full rank; the general case follows by continuity). We then define a function $u : [0,1] \to \mathbb{R}$ as follows:

\[ u(p) := f_c(p\rho + (1-p)\sigma) \tag{6.1} \]

(note that $u(p)$ implicitly depends also on $c$). Our goal now is to determine the range of values of $c$ for which

\[ u''(p) \le 0 \quad \forall p \in [0,1] \text{ and } \forall \rho, \sigma \in \mathcal{D}(\mathbb{C}^d). \tag{6.2} \]

This would imply that $u(p)$ is concave and, in particular, that $u(p) \ge p\,u(1) + (1-p)\,u(0)$, which by eq. (6.1) is equivalent to concavity of $f_c$. The following lemma uses this approach to obtain the desired condition on $(h, c)$.


Lemma 14. Let $h : \mathbb{R} \to \mathbb{R}$ be any twice-differentiable, monotonically increasing function. Then the function $f_c(\rho) := h(cH(\rho))$ with $c \ge 0$ and $\rho \in \mathcal{D}(\mathbb{C}^d)$ is concave on the set of quantum states $\mathcal{D}(\mathbb{C}^d)$ if for any probability distribution $q = (q_1, \dots, q_d)$, the following condition is satisfied:

\[ c\, \frac{h''(cH(q))}{h'(cH(q))} \le \frac{1}{L(q) - H(q)^2}, \tag{6.3} \]

where $H(q) = -\sum_{i=1}^{d} q_i \log q_i$ is the Shannon entropy of $q$ and $L(q) := \sum_{i=1}^{d} q_i (\log q_i)^2$.
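To prove this lemma we employ the following definitions and results from [Aud14]. For operators $A, \Delta \in \mathcal{L}(\mathcal{H})$, where $A > 0$ and $\Delta$ is Hermitian, the Fréchet derivative of the operator logarithm is given by the linear, completely positive map $\Delta \mapsto \mathcal{T}_A(\Delta)$ [Aud14], where

\[
\begin{aligned}
\mathcal{T}_A(\Delta) &:= \frac{d}{dt}\Big|_{t=0} \log(A + t\Delta) \\
&= \int_0^\infty ds\, (A + sI)^{-1} \Delta\, (A + sI)^{-1}.
\end{aligned}
\tag{6.4}
\]

Here the second line follows from the integral representation of the operator logarithm,

\[ \log A = \int_0^\infty ds \left( \frac{1}{1+s} I - (A + sI)^{-1} \right) \quad \text{for any } A > 0, \tag{6.5} \]

and the fact that

\[ \frac{d}{dt} (A + t\Delta)^{-1} = -(A + t\Delta)^{-1} \Delta\, (A + t\Delta)^{-1}. \tag{6.6} \]

When $A$ and $\Delta$ commute, the integral in eq. (6.4) can be worked out and we get $\mathcal{T}_A(\Delta) = \Delta A^{-1}$.

It is easy to check that the map $\mathcal{T}_A$ is self-adjoint, i.e., for any $B \in \mathcal{L}(\mathcal{H})$,

\[ \mathrm{Tr}\bigl(B\, \mathcal{T}_A(\Delta)\bigr) = \mathrm{Tr}\bigl(\mathcal{T}_A(B)\, \Delta\bigr), \tag{6.7} \]

and that

\[ \mathcal{T}_A(A) = I. \tag{6.8} \]

This linear map induces a metric on the space of Hermitian matrices given by

\[ M_A(\Delta) := \mathrm{Tr}\bigl(\Delta\, \mathcal{T}_A(\Delta)\bigr). \tag{6.9} \]

This metric is known to be monotone [Aud14]; that is, for any completely positive trace-preserving linear map $\Lambda$,

\[ M_{\Lambda(A)}(\Lambda(\Delta)) \le M_A(\Delta). \tag{6.10} \]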
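The properties of the Fréchet derivative listed above can be illustrated numerically. The sketch below evaluates the integral in eq. (6.4) in the eigenbasis of $A$, where it reduces to an entrywise product with the divided differences $(\log a_i - \log a_j)/(a_i - a_j)$ (equal to $1/a_i$ when $a_i = a_j$); this closed form is a standard fact that is not spelled out in the text, and the function names are illustrative assumptions.

```python
import numpy as np


def frechet_log(A, Delta):
    """T_A(Delta) for Hermitian A > 0, via divided differences of log."""
    a, V = np.linalg.eigh(A)
    D = V.conj().T @ Delta @ V          # Delta in the eigenbasis of A
    diff = a[:, None] - a[None, :]
    loga = np.log(a)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = (loga[:, None] - loga[None, :]) / diff
    # On (near-)degenerate pairs the divided difference tends to 1/a_i.
    kernel = np.where(np.abs(diff) < 1e-12, 1.0 / a[:, None], ratio)
    return V @ (kernel * D) @ V.conj().T


def logm_pd(A):
    """Matrix logarithm of a positive definite Hermitian matrix."""
    a, V = np.linalg.eigh(A)
    return (V * np.log(a)) @ V.conj().T


def metric(A, Delta):
    """M_A(Delta) = Tr(Delta T_A(Delta)) from eq. (6.9)."""
    return float(np.trace(Delta @ frechet_log(A, Delta)).real)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    A = G @ G.conj().T + 0.1 * np.eye(3)
    X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    Delta = X + X.conj().T
    t = 1e-6
    fd = (logm_pd(A + t * Delta) - logm_pd(A - t * Delta)) / (2 * t)
    print(np.max(np.abs(fd - frechet_log(A, Delta))))  # small finite-difference error
```

The finite-difference comparison in the usage block checks the defining derivative in eq. (6.4); eqs. (6.7) and (6.8) can be verified in the same way.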

Now we are ready to prove Lemma 14. Our proof will proceed in two steps: first we will 
reduce the problem from general quantum states to commuting ones, and then restate the concavity 
condition for commuting states in terms of a similar condition for probability distributions. 

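Proof of Lemma 14. Let $\Delta := \rho - \sigma$ and $\xi := p\rho + (1-p)\sigma = \sigma + p\Delta$. Note that $\xi' := \frac{d\xi}{dp} = \Delta$ and $\xi'' = 0$. Recall from eq. (6.2) that concavity of $f_c$ is equivalent to $u''(p) \le 0$ where

\[ u(p) := f_c(\xi) = h(cH(\xi)). \tag{6.11} \]

To compute $u''(p)$, we will need to find the first two derivatives of $H(\xi) = -\mathrm{Tr}(\xi \log \xi)$ with respect to $p$. Noting that

\[ \frac{d}{dp} \log \xi = \mathcal{T}_\xi(\xi'), \tag{6.12} \]

we find that the first derivative of $H(\xi)$ is

\[
\begin{aligned}
\frac{d}{dp} H(\xi) &= -\mathrm{Tr}(\xi' \log \xi) - \mathrm{Tr}\bigl(\xi\, \mathcal{T}_\xi(\xi')\bigr) \\
&= -\mathrm{Tr}(\xi' \log \xi) - \mathrm{Tr}\bigl(\mathcal{T}_\xi(\xi)\, \xi'\bigr) \\
&= -\mathrm{Tr}(\xi' \log \xi) - \mathrm{Tr}\, \xi' \\
&= -\mathrm{Tr}(\xi' \log \xi).
\end{aligned}
\tag{6.13}
\]

In the first line we used the Fréchet derivative of the logarithm as given in eq. (6.12), while the second line follows from the self-adjointness (6.7) of the map $\mathcal{T}_\xi$. The last two lines follow from eq. (6.8) and the fact that $\mathrm{Tr}\, \xi' = \mathrm{Tr}\, \Delta = 0$. The second derivative is

\[ \frac{d^2}{dp^2} H(\xi) = -\mathrm{Tr}(\xi'' \log \xi) - \mathrm{Tr}\bigl(\xi'\, \mathcal{T}_\xi(\xi')\bigr) = -M_\xi(\xi'), \tag{6.14} \]

where the first term vanishes since $\xi'' = 0$ while the second term produces $M_\xi(\xi')$ by eq. (6.9).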

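We are now ready to calculate the second derivative of $u(p) = h(cH(\xi))$ introduced in eq. (6.11). By the chain rule,

\[ u'(p) = c\, h'(cH(\xi))\, \frac{dH(\xi)}{dp}, \tag{6.15} \]

\[ u''(p) = c^2 h''(cH(\xi)) \left( \frac{dH(\xi)}{dp} \right)^2 + c\, h'(cH(\xi))\, \frac{d^2 H(\xi)}{dp^2}. \tag{6.16} \]

Therefore, $u''(p) \le 0$ is equivalent to

\[ c\, h''(cH(\xi))\, \bigl[\mathrm{Tr}(\xi' \log \xi)\bigr]^2 \le h'(cH(\xi))\, M_\xi(\xi'), \tag{6.17} \]

where we divided by $c > 0$ (the case $c = 0$ is trivial) and substituted the derivatives of $H(\xi)$ from eqs. (6.13) and (6.14). Since we imposed the condition that $h$ is monotonically increasing, we can divide by $h'$ and get the condition

\[ c\, \frac{h''(cH(\xi))}{h'(cH(\xi))} \le \frac{M_\xi(\Delta)}{\bigl[\mathrm{Tr}(\Delta \log \xi)\bigr]^2}. \tag{6.18} \]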

By fixing the state $\xi$ and minimizing the right-hand side over all $\Delta$, we get a stronger inequality, which in particular implies eq. (6.18). Consider the dephasing channel $\Lambda := \mathrm{diag}_\xi$ which, when acting on an operator $\Delta$, sets all its off-diagonal elements equal to 0 in any basis in which $\xi$ is diagonal (in particular, in its eigenbasis). Thus, $\mathrm{diag}_\xi(\xi) = \xi$ and

\[ M_\xi(\mathrm{diag}_\xi(\Delta)) \le M_\xi(\Delta), \tag{6.19} \]

by the monotonicity property (6.10) of the metric $M_\xi(\Delta)$ under CPTP maps. Hence, on replacing $\Delta$ by $\mathrm{diag}_\xi(\Delta)$ on the right-hand side of eq. (6.18), the denominator remains the same but the numerator does not increase. Since $[\mathrm{diag}_\xi(\Delta), \xi] = 0$, to obtain the minimum value of the right-hand side of eq. (6.18), it therefore suffices to restrict to those $\Delta$ which commute with $\xi$. Recall that $\mathcal{T}_\xi(\Delta) = \Delta \xi^{-1}$ for commuting $\xi$ and $\Delta$, so

\[ M_\xi(\Delta) = \mathrm{Tr}\bigl(\Delta\, \mathcal{T}_\xi(\Delta)\bigr) = \sum_{i=1}^{d} \delta_i^2 / \xi_i, \tag{6.20} \]

where $\xi_i$ and $\delta_i$ for $i \in \{1, \dots, d\}$ are the diagonal elements of $\xi$ and $\Delta$ in the eigenbasis of $\xi$ (in fact, $\xi_i$ are the eigenvalues of $\xi$). We can now phrase the problem of minimizing the right-hand side of eq. (6.18) as follows:

\[ \text{minimize} \quad \frac{\sum_{i=1}^{d} \delta_i^2 / \xi_i}{\bigl(\sum_{i=1}^{d} \delta_i \log \xi_i\bigr)^2} \quad \text{subject to} \quad \sum_{i=1}^{d} \delta_i = 0, \tag{6.21} \]
where the condition $\sum_{i=1}^{d} \delta_i = 0$ arises from the fact that $\mathrm{Tr}\, \Delta = 0$.

Since the objective function in eq. (6.21) is invariant under scaling of all $\delta_i$ by the same scale factor, we can convert the minimization problem to the following one:

\[ \text{minimize} \quad \sum_{i=1}^{d} \delta_i^2 / \xi_i \quad \text{subject to} \quad \sum_{i=1}^{d} \delta_i = 0 \quad \text{and} \quad \sum_{i=1}^{d} \delta_i \log \xi_i = 1. \tag{6.22} \]

Using the method of Lagrange multipliers, we form the Lagrangian

\[ \mathcal{L} := \sum_{i=1}^{d} \delta_i^2 / \xi_i - 2\lambda \sum_{i=1}^{d} \delta_i - 2\mu \Bigl( \sum_{i=1}^{d} \delta_i \log \xi_i - 1 \Bigr). \tag{6.23} \]

To find its stationary points, we require that $\partial \mathcal{L} / \partial \delta_i = 0$ for all $i$. This implies

\[ \delta_i = \xi_i (\lambda + \mu \log \xi_i). \tag{6.24} \]

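To find the Lagrange multipliers $\lambda$ and $\mu$, we substitute the $\delta_i$ back into the constraints of the optimization problem (6.22). We get the following equations:

\[ \lambda - \mu H = 0, \qquad -\lambda H + \mu L = 1, \tag{6.25} \]

where $H := -\sum_{i=1}^{d} \xi_i \log \xi_i$ and $L := \sum_{i=1}^{d} \xi_i (\log \xi_i)^2$. Their solution is

\[ \mu = \frac{1}{L - H^2}, \qquad \lambda = \frac{H}{L - H^2}. \tag{6.26} \]

Inserting eqs. (6.24) and (6.26) back in the objective function of eq. (6.22) yields

\[
\begin{aligned}
\sum_{i=1}^{d} \delta_i^2 / \xi_i &= \sum_{i=1}^{d} \xi_i (\lambda + \mu \log \xi_i)^2 \\
&= \lambda^2 - 2\lambda\mu H + \mu^2 L \\
&= \frac{1}{L - H^2}.
\end{aligned}
\tag{6.27}
\]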

Thus, eq. (6.18) is satisfied whenever 


U'(cH) ^ 1 

h'(cH) - L-H^' 


(6.28) 


Note that H = H{q) and L = L{q) where q := {qi,... ,qd) with qi := ^i is a probability distribution. 
Thus, condition (6.3) implies eq. (6.28) and hence the concavity of fc- □ 
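The Lagrange-multiplier solution in eqs. (6.24) to (6.27) is easy to confirm numerically. The following sketch builds the stationary point for a given spectrum, checks both constraints of problem (6.22), and compares the objective of problem (6.21) against random traceless perturbations. The names `optimal_delta` and `ratio` are illustrative and not taken from the paper.

```python
import numpy as np


def optimal_delta(xi):
    """Stationary point delta_i = xi_i (lambda + mu log xi_i) and the value 1/(L - H^2)."""
    xi = np.asarray(xi, dtype=float)
    logxi = np.log(xi)
    H = -np.sum(xi * logxi)
    L = np.sum(xi * logxi**2)
    mu = 1.0 / (L - H**2)        # eq. (6.26); requires a non-uniform xi
    lam = H / (L - H**2)
    return xi * (lam + mu * logxi), 1.0 / (L - H**2)


def ratio(delta, xi):
    """Objective of problem (6.21) for a traceless vector delta."""
    xi = np.asarray(xi, dtype=float)
    return float(np.sum(delta**2 / xi) / np.sum(delta * np.log(xi)) ** 2)


if __name__ == "__main__":
    xi = np.array([0.5, 0.3, 0.2])
    delta, vmin = optimal_delta(xi)
    print(np.sum(delta), np.sum(delta * np.log(xi)), ratio(delta, xi), vmin)
```

That the stationary point is a global minimum also follows from the Cauchy-Schwarz inequality applied to $\sum_i (\delta_i/\sqrt{\xi_i}) \cdot \sqrt{\xi_i}\,(\log \xi_i + H)$, which gives $(\sum_i \delta_i \log \xi_i)^2 \le (L - H^2) \sum_i \delta_i^2/\xi_i$ for traceless $\delta$.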

The quantity $L - H^2$ arising on the right-hand side of eq. (6.28) is known as the variance of the surprisal $\{-\log q_i\}$ [RW15, PPV10]:

\[ V(q) := L(q) - H(q)^2 = \sum_{i=1}^{d} q_i (-\log q_i)^2 - \Bigl( \sum_{i=1}^{d} q_i (-\log q_i) \Bigr)^2. \tag{6.29} \]

To find the optimal value of $c$ for which eq. (6.28) holds, we need to minimize its right-hand side over all attainable values of the quantity $L - H^2$ for a fixed value of $H$. In other words, we require the maximum attainable value of $L(q) - H(q)^2$ over all probability distributions $q$ over $d$ elements with a fixed value of the entropy $H(q) = H_0$ (in contrast, ref. [RW15] evaluated the maximum value of $V(q)$ without the constraint of $H(q)$ being fixed). We define

\[ L_{\max}(H_0) := \max\Bigl\{ L(q) : H(q) = H_0,\ \sum_{i=1}^{d} q_i = 1, \text{ and } q_i \ge 0 \text{ for all } i \Bigr\}. \tag{6.30} \]

To obtain this value and the corresponding optimal distribution $q$, we employ the following lemma.

Lemma 15. The maximum of $L(q) := \sum_{i=1}^{d} q_i (\log q_i)^2$ over all probability distributions $q = (q_1, \dots, q_d)$ with fixed Shannon entropy $H(q) = H_0 \in [0, \log d]$ is achieved by a distribution of the form

\[ q = (x, \dots, x, y) \quad \text{for some } 0 \le x \le y \text{ such that } (d-1)x + y = 1. \tag{6.31} \]

If we let $r := d - 1$, then the value of $L(q)$ achieved by this distribution is

\[ L_{\max}(H_0) = rx(1 - rx)\bigl(\log x - \log(1 - rx)\bigr)^2 + H_0^2. \tag{6.32} \]

Proof. For given $H_0 \in [0, \log d]$, we need to solve the following constrained optimization problem:

\[ \text{maximize} \quad \sum_{i=1}^{d} q_i (\log q_i)^2 \quad \text{subject to} \quad \sum_{i=1}^{d} q_i = 1 \quad \text{and} \quad -\sum_{i=1}^{d} q_i \log q_i = H_0. \tag{6.33} \]

Since the domain of the logarithm is $\mathbb{R}_+$, we do not have to explicitly impose the condition that $q_i \ge 0$ for all $1 \le i \le d$.

The maximum of a continuously differentiable function $f$ over a domain $D$ either occurs at a stationary point of $f$, or on the boundary of $D$. In the present case $D$ is the probability simplex, hence its boundary consists of probability vectors where some of the $q_i$ are zero. Due to the fact that both $-q_i \log q_i$ and $q_i (\log q_i)^2$ are zero for $q_i = 0$, such points can be conveniently modeled by treating them as probability vectors in a lower-dimensional probability space. We can therefore safely assume that the sought-after maximum occurs at the relative interior of a $K$-dimensional probability simplex (with $K \le d$), and at the very end of the calculation perform a further maximization over $K$. In particular, it will turn out that the global maximum occurs for $K = d$.

The aforementioned maximum can be found as a stationary point of the Lagrangian

\[ \mathcal{L} := \sum_{i=1}^{K} q_i (\log q_i)^2 + \lambda \Bigl( \sum_{i=1}^{K} q_i - 1 \Bigr) - \mu \Bigl( \sum_{i=1}^{K} q_i \log q_i + H_0 \Bigr). \tag{6.34} \]

Requiring that all derivatives $\partial \mathcal{L} / \partial q_i$ be zero yields the equations

\[ (\log q_i)^2 + (2 - \mu) \log q_i + \lambda - \mu = 0. \tag{6.35} \]

As this is a fixed quadratic function of $\log q_i$, and therefore may have at most two solutions, we infer that the stationary points of $\mathcal{L}$ are those distributions $q$ whose elements are either all equal (and hence equal to $1/K$) or equal to two possible values. That is, up to permutations, the distribution $q$ can be uniquely represented as

\[ q_{k,x} := (\underbrace{x, \dots, x}_{k}, \underbrace{y, \dots, y}_{K-k}) \tag{6.36} \]

for some integer $k \in \{1, \dots, K\}$ and some probabilities $0 \le x \le y$ such that

\[ kx + (K - k)y = 1. \tag{6.37} \]
Figure 2: The locus of the points $(H, L)$ as $x$ varies over $[0, 1/K]$, for the case $K = 6$ and for each value of $k \in \{1, \dots, 5\}$, with $k$ increasing towards the left. The vertical axis shows $L(k, x)$. These loci are curves with lower end point $H = H(k, 0) = \log(K - k)$, $L = L(k, 0) = (\log(K - k))^2$ and upper end point $H = H(k, 1/K) = \log K$, $L = L(k, 1/K) = (\log K)^2$. The value $k = 0$ yields a single point, just as the $k = K$ case does, coinciding with the upper end point of all other $(H, L)$ curves.

From this we get in addition that $x \le 1/K \le y$. For $k = K$, there is only one distribution of this form, namely, the uniform distribution $q_{K,x} = (1/K, \dots, 1/K)$. This distribution has $H(q_{K,x}) = \log K$ and $L(q_{K,x}) = (\log K)^2$, which are independent of $x$, so there is nothing to optimize in this case.

From now on we assume that $k \ne K$ and thus $H_0 < \log K$. Then $y := (1 - kx)/(K - k)$ from the normalization constraint (6.37), so we can compute

\[ H(k, x) := H(q_{k,x}) = -kx \log x - (K - k)\, y \log y, \tag{6.38} \]

\[ L(k, x) := L(q_{k,x}) = kx (\log x)^2 + (K - k)\, y (\log y)^2. \tag{6.39} \]

To obtain the global maximum of $L(k, x)$, a further optimization over $k \in \{1, \dots, K-1\}$ and $x \in [0, 1/K)$ is required. The numerical calculations presented in the diagram in Fig. 2 suggest that $k = K - 1$ yields the maximal value of $L$. To prove that this is actually true we will temporarily remove the restriction that $k$ be an integer and consider the entire range $k \in (0, K)$. Our analysis will show that, keeping $H(k, x)$ fixed, $L(k, x)$ increases with $k$.

To keep $H$ fixed as $k$ changes, $x$ will have to change as well. For given $H_0 \in [0, \log K)$, let $x(k)$ be the function of $k$ implicitly given by $H(k, x(k)) = H_0$. We would like to know how $L(k, x(k))$ changes as a function of $k$. Taking the total derivative with respect to $k$ gives

\[ \frac{d}{dk} H(k, x(k)) = \frac{\partial}{\partial k} H(k, x(k)) + \frac{\partial}{\partial x} H(k, x(k))\, x'(k) = 0, \tag{6.40} \]

\[ \frac{d}{dk} L(k, x(k)) = \frac{\partial}{\partial k} L(k, x(k)) + \frac{\partial}{\partial x} L(k, x(k))\, x'(k). \tag{6.41} \]
Solving the first equation for $x'(k)$ and substituting the solution in the second equation gives

\[
\begin{aligned}
\frac{d}{dk} L(k, x(k)) &= \frac{\partial}{\partial k} L(k, x(k)) - \frac{\partial}{\partial x} L(k, x(k))\, \frac{\partial_k H(k, x(k))}{\partial_x H(k, x(k))} \\
&= \frac{1}{K - k} \left( \bigl(1 + (K - 2k)x\bigr) \Bigl( \log \frac{1 - kx}{K - k} - \log x \Bigr) - 2(1 - Kx) \right),
\end{aligned}
\tag{6.42}
\]

where the second line follows by substituting the partial derivatives of $H(k, x)$ and $L(k, x)$ defined in eqs. (6.38) and (6.39).

Note that $z \ge \tanh z = (e^{2z} - 1)/(e^{2z} + 1)$ for any $z \ge 0$. By choosing $z := (\log w)/2$ we get $\log w \ge 2(w - 1)/(w + 1)$ for $w \ge 1$. Next, since $x \le 1/K$, we can take $w := (1 - kx)/(Kx - kx) \ge 1$ which gives

\[ \log \frac{1 - kx}{K - k} - \log x \ge 2 \left( \frac{1 - kx}{(K - k)x} - 1 \right) \Big/ \left( \frac{1 - kx}{(K - k)x} + 1 \right) = 2\, \frac{1 - Kx}{1 + (K - 2k)x}. \tag{6.43} \]

Inserting this in eq. (6.42) and noting that $1 + (K - 2k)x > 0$ for $x \le 1/K$, we conclude that $dL(k, x(k))/dk \ge 0$, so $L(k, x(k))$ is increasing as a function of $k$, just as we intended to show.

Reverting to integer values of $k$, we find that, for a fixed value of $H(k, x)$, the value of $L(k, x)$ is maximized when $k$ is the largest integer in the open interval $(0, K)$, namely $K - 1$. Then $L_{\max}(H_0)$, the maximum value of $L(k, x)$ subject to $H(k, x) = H_0$, see eq. (6.30), is given by $L(K - 1, x)$ where $x$ is such that $H(K - 1, x) = H_0$.

Finally, we have to perform a further maximization over $K$, for $K \le d$. In a similar way as before, $x$ becomes a function of $K$. We now show that the maximum of $L(K - 1, x(K))$ under the constraints $H(K - 1, x(K)) = H_0$ and $K \le d$ occurs for $K = d$. Solving the equation $0 = \frac{d}{dK} H(K - 1, x(K))$ for $x'(K)$ and substituting the solution back into $\frac{d}{dK} L(K - 1, x(K))$ shows after a fair bit of algebra that

\[ \frac{d}{dK} L(K - 1, x(K)) = x(K) \bigl( 2 + \log(1 - K + 1/x(K)) \bigr), \tag{6.44} \]

which is clearly non-negative for $0 < x(K) \le 1/K$, hence $L(K - 1, x(K))$ increases with $K$. We conclude that the overall maximum occurs for $K = d$, as we set out to prove.

The last statement of the lemma is now easily shown. From eq. (6.39) we infer that

\[ L_{\max}(H_0) - H_0^2 = L(d - 1, x) - H(d - 1, x)^2 = rx(1 - rx)\bigl(\log x - \log(1 - rx)\bigr)^2, \tag{6.45} \]

where $r := d - 1$ and $x \in [0, 1/d]$ satisfies $H(d - 1, x) = H_0$. □


For $r = d - 1$ and any $x \in [0, 1/d]$, if $q_{r,x}$ is the probability distribution defined in eq. (6.36), we denote its Shannon entropy and the information variance by

\[ s_r(x) := H(q_{r,x}) = -rx \log x - (1 - rx) \log(1 - rx), \tag{6.46} \]

\[ w_r(x) := V(q_{r,x}) = L(q_{r,x}) - H(q_{r,x})^2 = rx(1 - rx)\bigl(\log x - \log(1 - rx)\bigr)^2. \tag{6.47} \]
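Lemma 15 says that, among all distributions with a given entropy, the surprisal variance is largest for the two-valued family $q_{r,x}$. A minimal numerical illustration, assuming random test distributions drawn from a flat Dirichlet distribution (a choice made here only for the example) and using bisection to invert $s_r$, could look as follows; the function names are hypothetical.

```python
import math
import numpy as np


def s_r(x, r):
    """Entropy of (x, ..., x, 1 - r x), eq. (6.46); requires 0 < x <= 1/(r+1)."""
    y = 1.0 - r * x
    return -r * x * math.log(x) - y * math.log(y)


def w_r(x, r):
    """Surprisal variance of (x, ..., x, 1 - r x), eq. (6.47)."""
    y = 1.0 - r * x
    return r * x * y * (math.log(x) - math.log(y)) ** 2


def solve_x(h0, d, iters=200):
    """Bisection for x in (0, 1/d] with s_r(x) = h0; s_r is increasing there."""
    r = d - 1
    lo, hi = 1e-300, 1.0 / d
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if s_r(mid, r) < h0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def entropy_and_variance(q):
    """Return (H(q), V(q)) for a strictly positive probability vector q."""
    q = np.asarray(q, dtype=float)
    logq = np.log(q)
    h = -float(np.sum(q * logq))
    return h, float(np.sum(q * logq**2)) - h * h


def lemma15_holds(d=4, trials=500, seed=0, tol=1e-7):
    """Check V(q) <= w_r(x) whenever s_r(x) = H(q), on random q."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        q = rng.dirichlet(np.ones(d))
        h0, v = entropy_and_variance(q)
        if v > w_r(solve_x(h0, d), d - 1) + tol:
            return False
    return True


if __name__ == "__main__":
    print(lemma15_holds())
```

The bisection relies on $s_r'(x) = r \log\bigl((1 - rx)/x\bigr) \ge 0$ for $x \le 1/d$, so $s_r$ maps $(0, 1/d]$ monotonically onto $(0, \log d]$.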


In terms of these quantities, the condition in Lemma 14, under which a given function of the von Neumann entropy is concave on the set of qudit states, is expressed by the following theorem.

Theorem 16. Let $h : \mathbb{R} \to \mathbb{R}$ be a twice-differentiable, monotonically increasing function. Then the function $f_c(\rho) := h(cH(\rho))$ with $c \ge 0$ and $\rho \in \mathcal{D}(\mathbb{C}^d)$ is concave on $\mathcal{D}(\mathbb{C}^d)$ if

\[ c\, \frac{h''(c\, s_r(x))}{h'(c\, s_r(x))} \le \frac{1}{w_r(x)}, \tag{6.48} \]

for all $0 < x \le 1/d$, where $r := d - 1$ and functions $s_r(x)$ and $w_r(x)$ are defined in eqs. (6.46) and (6.47).
6.1 Concavity of Entropy Power

In this section we use Theorem 16 to establish the first item of Theorem 6, namely, that the entropy power $E_c(\rho) = e^{cH(\rho)}$ of a state $\rho \in \mathcal{D}(\mathbb{C}^d)$ is concave for $0 \le c \le 1/(\log d)^2$.

Proof of Theorem 6 (concavity of $E_c$). In this case we have $h(x) = \exp(x)$, so the condition (6.48) just translates to

\[ c \le \frac{1}{w_r(x)}, \quad \forall\, 0 \le x \le 1/d. \tag{6.49} \]

Therefore,

\[ c \le \Bigl( \max_{0 \le x \le 1/d} w_{d-1}(x) \Bigr)^{-1} =: c_{\max}. \tag{6.50} \]

From the expression of $w_r$ follows a simple lower bound on the largest allowed value of $c$. Putting $y = rx$, with $0 \le y \le (d - 1)/d < 1$,

\[
\begin{aligned}
w_r(x) &= y(1 - y)\bigl(-\log r + \log y - \log(1 - y)\bigr)^2 \\
&= A (\log r)^2 + B \log r + C,
\end{aligned}
\tag{6.51}
\]

where the coefficients $A$, $B$, $C$ of this quadratic polynomial in $\log r$ are bounded above as follows: $A := y(1 - y) \le 1/4$, $B := -2y(1 - y)(\log y - \log(1 - y)) \le 1/2$, and $C := y(1 - y)(\log y - \log(1 - y))^2 \le 1/2$. Hence,

\[ w_r(x) \le (1 + \log r)/2 + (\log r)^2/4 \quad \text{with } r = d - 1, \tag{6.52} \]

and we obtain

\[ c_{\max} \ge 1 \big/ \bigl[ (1 + \log(d - 1))/2 + (\log(d - 1))^2/4 \bigr]. \tag{6.53} \]

This bound becomes asymptotically exact in the limit of large $d$. Note that the right-hand side of eq. (6.53) is larger than $1/(\log d)^2$ for $d \ge 3$. For $d = 2$, the right-hand side of eq. (6.53) is equal to 2, which is not larger than $1/(\log 2)^2 \approx 2.0814$. However, for this case one can numerically evaluate the expression (6.50) for $c_{\max}$ to obtain the value 2.2767, which is indeed greater than $1/(\log 2)^2$. □
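The numerical value of $c_{\max}$ quoted for $d = 2$ can be reproduced with a simple grid search over eq. (6.50). The sketch below is an illustration only: the grid resolution and the function names `c_max` and `c_lower_bound` are choices made here, and a grid search slightly underestimates the maximum of $w_r$ (hence slightly overestimates $c_{\max}$).

```python
import numpy as np


def w_r(x, r):
    """Surprisal variance of (x, ..., x, 1 - r x), eq. (6.47), vectorized in x > 0."""
    y = 1.0 - r * x
    return r * x * y * (np.log(x) - np.log(y)) ** 2


def c_max(d, grid=200_001):
    """Grid-search estimate of eq. (6.50): 1 / max_{0 < x <= 1/d} w_{d-1}(x)."""
    x = np.linspace(0.0, 1.0 / d, grid)[1:]   # drop x = 0, where w_r vanishes
    return float(1.0 / np.max(w_r(x, d - 1)))


def c_lower_bound(d):
    """Closed-form lower bound on c_max from eq. (6.53)."""
    lr = np.log(d - 1)
    return float(1.0 / ((1.0 + lr) / 2.0 + lr**2 / 4.0))


if __name__ == "__main__":
    for d in range(2, 7):
        print(d, c_max(d), c_lower_bound(d), 1.0 / np.log(d) ** 2)
```

For $d = 2$ the maximizer can also be characterized analytically: writing $t = \log((1-x)/x)$ and $u = t/2$, the stationarity condition becomes $u \tanh u = 1$ and the maximum of $w_1$ equals $u^2 - 1$.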

From this we can also infer that for any probability distribution $p$ over $d$ elements, the function $E_c(p) := e^{cH(p)}$ is concave for $0 \le c \le 1/(\log d)^2$.

Remark. The fact that for $h(x) := \exp(x)$ the inequality (6.3) of Lemma 14 holds for any value of $c$ in the range $0 \le c \le 1/(\log d)^2$ can also be proved using Theorem 8 and Lemma 15 of [RW15].


6.2 Concavity of Entropy Photon Number 

In this section we use Theorem 16 to establish the second item of Theorem 6, namely, that the entropy photon number $N_c(\rho)$ of a qudit, defined by eq. (3.3), is concave for $0 \le c \le 1/(d - 1)$.

Proof of Theorem 6 (concavity of $N_c$). In this case the calculations are more complicated because $h$ is not given directly but as the inverse of a function: $h = g^{-1}$, where

\[ g(x) = -x \log x + (1 + x) \log(1 + x). \tag{6.54} \]

Figure 3: Parametric plot of $k(y)$ versus $g(y)$.

The derivatives of $h$ are given by

\[ h'(x) = \frac{1}{g'(h(x))} = \frac{1}{\log(1 + 1/h(x))}, \tag{6.55} \]

\[ h''(x) = \frac{1}{\bigl(h(x) + h^2(x)\bigr) \bigl[\log(1 + 1/h(x))\bigr]^3}. \tag{6.56} \]

(6.56) 


Defining the function

\[ k(x) := x(1 + x)\bigl(\log x - \log(1 + x)\bigr)^2, \tag{6.57} \]

we have $h'(x)/h''(x) = k(h(x))$. The function $k$ is monotonically increasing, concave, and ranges from 0 to 1. The condition on $c$ becomes

\[ g^{-1}(c\, s_r(x)) \ge k^{-1}(c\, w_r(x)). \tag{6.58} \]

If we define the variables $y$ and $z$ according to

\[ g(y) = c\, s_r(x), \qquad k(z) = c\, w_r(x), \tag{6.59} \]

the condition is $y \ge z$. If we now exploit the monotonicity of $k$, this condition is equivalent to $k(y) \ge k(z) = c\, w_r(x)$. We therefore require that

\[ g(y) = c\, s_r(x) \quad \text{implies} \quad k(y) \ge c\, w_r(x). \tag{6.60} \]

We will show that this holds for $c \le 1/r = 1/(d - 1)$. In Fig. 3 we depict the graph of $k(y)$ versus $g(y)$. The graph seems to indicate that the resulting curve is concave and monotonically increasing; that this is actually true follows from the easily checked fact that the function $k'/g' = (1 + 2x)(\log(1 + x) - \log x) - 2$, representing the slope of the curve, is positive and decreasing. The condition (6.60) amounts to the statement that any point $(c\, s_r(x), c\, w_r(x))$ lies in the area below this curve. Hence if the condition is satisfied for a certain value of $c$, then it is also satisfied for any smaller positive value of $c$. Therefore, we only need to prove eq. (6.60) for $c = 1/(d - 1)$.

The formal similarities between $g$ and $s_r$ and between $k$ and $w_r$ let us define two interpolating functions $g_1(x, b)$ and $k_1(x, b)$ as a function of the original $x$ and an interpolation parameter $b$:

\[ g_1(x, b) = -x \log x + \frac{1}{b}(1 + bx) \log(1 + bx), \tag{6.61} \]

\[ k_1(x, b) = x(1 + bx)\bigl(\log x - \log(1 + bx)\bigr)^2. \tag{6.62} \]

Let $S$ indicate the domain of $g_1$ and $k_1$, which is $b \in [1 - d, 1]$ and $x \in [0, 1/d]$, as before. To ensure continuity of $g_1$ at $b = 0$, we define $g_1(x, 0)$ to be its limit value $(-x \log x + x)$. Hence, we have the correspondences

\[ s_r(x)/(d - 1) = g_1(x, 1 - d), \qquad g(x) = g_1(x, 1), \tag{6.63} \]

\[ w_r(x)/(d - 1) = k_1(x, 1 - d), \qquad k(x) = k_1(x, 1). \tag{6.64} \]

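The condition (6.60) is therefore satisfied if a continuous path $x(b)$ exists (from $x(1 - d) = x$ to $x(1) = y$) such that $g_1(x(b), b)$ remains constant and $k_1(x(b), b)$ increases with $b$. As in the proof of Lemma 15 this requires the positivity of

\[
\begin{aligned}
\frac{d}{db} k_1(x(b), b) &= \frac{\partial}{\partial b} k_1(x(b), b) - \frac{\partial}{\partial x} k_1(x(b), b)\, \frac{\partial_b\, g_1(x(b), b)}{\partial_x\, g_1(x(b), b)} \\
&= \frac{1}{b^2} \Bigl( bx \bigl(2 + \log x + bx (\log x)^2\bigr) + (1 + bx)^2 \bigl(\log(1 + bx)\bigr)^2 \\
&\qquad\quad - \bigl(2 + bx + (1 + 2bx + 2b^2x^2) \log x\bigr) \log(1 + bx) \Bigr).
\end{aligned}
\tag{6.65}
\]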


Let us introduce the variable $u = 1 + bx$. In $S$ we have $(1 - b)x \le 1$ so that $x \le u$; furthermore, $b \le 1$ and $x \le 1/d$, so that $u \le 1 + 1/d$. The second factor can now be written more succinctly as

\[
\begin{aligned}
&(u - 1)\bigl(2 + \log x + (u - 1)(\log x)^2\bigr) + u^2 (\log u)^2 - \bigl(1 + u + (1 - 2u + 2u^2) \log x\bigr) \log u \\
&\quad = (u - 1)^2 \log x\, (\log x - \log u) + u^2 (\log u)^2 \\
&\qquad + 2(u - 1) - (u + 1) \log u - (1 - u + u^2 \log u) \log x.
\end{aligned}
\tag{6.66}
\]

The first two terms are clearly non-negative. The factor $1 - u + u^2 \log u$ is non-negative too, as can be seen from the inequality $1 - \exp(-v) \le v \le v \exp(v)$ applied to $v = \log u$. Furthermore, $\log x \le \log(1/d) \le \log(1/2) < -1/2$, so that the last term is bounded below by $(1 - u + u^2 \log u)/2$. It is therefore left to show that $2(u - 1) - (u + 1) \log u + (1 - u + u^2 \log u)/2$ is non-negative.

For $0 < u \le 1$ we can exploit the inequality $\log u \le 2(u - 1)/(u + 1)$, so that we obtain $2(u - 1) - (u + 1) \log u \ge 0$. The remaining term is non-negative too, as we have just shown.

For $1 < u \le 1 + 1/d$ we exploit instead the inequality $\log u \le u - 1$. Then, based on the fact that in this range $(u - 1)^2 - 3 < 0$,

\[
\begin{aligned}
2(u - 1) - (u + 1) \log u + (1 - u + u^2 \log u)/2 &= \tfrac{1}{2} \bigl( 3(u - 1) + ((u - 1)^2 - 3) \log u \bigr) \\
&\ge \tfrac{1}{2} \bigl( 3(u - 1) + ((u - 1)^2 - 3)(u - 1) \bigr) \\
&= (u - 1)^3/2 \ge 0.
\end{aligned}
\tag{6.67}
\]

This shows that $k_1(x(b), b)$ indeed increases with $b$, whence condition (6.60) holds for $c = 1/(d - 1)$ and, by a previous argument, for $c \le 1/(d - 1)$. In other words, we have shown that the function $g^{-1}(cH(\rho))$ is concave for $0 \le c \le 1/(d - 1)$. As this includes the value $c = 1/d$, the photon number is concave. □
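Condition (6.60) at the extremal value $c = 1/(d-1)$ can also be probed directly: for each $x$ on a grid one solves $g(y) = c\, s_r(x)$ for $y$ and compares $k(y)$ with $c\, w_r(x)$. The sketch below does this by bisection; it is a numerical illustration under the assumption of a uniform grid in $x$, and the helper names (`g_inverse`, `min_margin`) are not from the paper.

```python
import math


def g(x):
    """g(x) = -x log x + (1 + x) log(1 + x), eq. (6.54), with g(0) = 0."""
    return 0.0 if x == 0 else -x * math.log(x) + (1 + x) * math.log1p(x)


def k(x):
    """k(x) = x (1 + x) (log x - log(1 + x))^2, eq. (6.57), with k(0) = 0."""
    return 0.0 if x == 0 else x * (1 + x) * (math.log(x) - math.log1p(x)) ** 2


def s_r(x, r):
    y = 1.0 - r * x
    return -r * x * math.log(x) - y * math.log(y)


def w_r(x, r):
    y = 1.0 - r * x
    return r * x * y * (math.log(x) - math.log(y)) ** 2


def g_inverse(t, iters=200):
    """Bisection on [0, 1]; valid for 0 <= t <= g(1) = 2 log 2, where g is increasing."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def min_margin(d, grid=2000):
    """Smallest value of k(y) - c w_r(x) over the grid, with c = 1/(d-1) and g(y) = c s_r(x)."""
    r = d - 1
    c = 1.0 / r
    worst = math.inf
    for i in range(1, grid + 1):
        x = i / (grid * d)                  # x ranges over (0, 1/d]
        y = g_inverse(c * s_r(x, r))
        worst = min(worst, k(y) - c * w_r(x, r))
    return worst


if __name__ == "__main__":
    for d in (2, 3, 5, 10):
        print(d, min_margin(d))
```

The bracket $[0, 1]$ suffices for the bisection because $c\, s_r(x) \le \log d/(d-1) \le \log 2 < g(1)$ for all $d \ge 2$.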
Figure 4: A schematic representation of the channel $\mathcal{E}_{a,\sigma}$ defined in eq. (7.1), which maps $\rho$ to $\mathcal{E}_{a,\sigma}(\rho) := \rho \boxplus_a \sigma$.

7 Bounds on minimum output entropy and Holevo capacity 

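As an application of our results we now consider the class of quantum channels $\mathcal{E}_{a,\sigma} : \mathcal{D}(\mathbb{C}^d) \to \mathcal{D}(\mathbb{C}^d)$ obtained from the partial swap channel $\mathcal{E}_a$ from eq. (4.9) by fixing the second input state $\sigma$ (see Fig. 4). Such channels are parameterized by a variable $a \in [0,1]$ and a quantum state $\sigma \in \mathcal{D}(\mathbb{C}^d)$, and act as follows:

\[ \mathcal{E}_{a,\sigma}(\rho) := \rho \boxplus_a \sigma. \tag{7.1} \]

For example, for the choice $\sigma = I/d$ (the completely mixed state) the channel $\mathcal{E}_{a,\sigma}$ is just the quantum depolarizing channel with parameter $a$. If $\sigma = \delta |0\rangle\langle 0| + (1 - \delta) |1\rangle\langle 1| \in \mathcal{D}(\mathbb{C}^2)$ for some $\delta \in [0,1]$, then $\mathcal{E}_{a,\sigma}$ is a qubit channel whose output density matrix is

\[
\begin{pmatrix}
a r_{00} + (1 - a)\delta & r_{01}\bigl(a - i\sqrt{a(1 - a)}\,(1 - 2\delta)\bigr) \\
r_{10}\bigl(a + i\sqrt{a(1 - a)}\,(1 - 2\delta)\bigr) & a r_{11} + (1 - a)(1 - \delta)
\end{pmatrix}
\tag{7.2}
\]

for any input qubit state $\rho := \sum_{i,j=0}^{1} r_{ij} |i\rangle\langle j|$.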

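An important characteristic quantity for any quantum channel $\mathcal{E}$ is its minimum output entropy, which is defined as

\[ H_{\min}(\mathcal{E}) := \min_{\rho} H(\mathcal{E}(\rho)). \tag{7.3} \]

Lower bounds on this quantity for the class of channels $\mathcal{E}_{a,\sigma}$ can be obtained by using our EPIs and EPnI. In fact, the inequalities of Corollary 7 give various lower bounds on the output entropy of the channel $\mathcal{E}_{a,\sigma}$ (i.e. the entropy of any output state) in terms of the entropy $H(\rho)$ of an input state $\rho$:

\[ H(\mathcal{E}_{a,\sigma}(\rho)) \ge a H(\rho) + (1 - a) H(\sigma), \tag{7.4} \]

\[ H(\mathcal{E}_{a,\sigma}(\rho)) \ge \frac{1}{c} \log\bigl[ a \exp(cH(\rho)) + (1 - a) \exp(cH(\sigma)) \bigr] \quad \text{with } c = 1/(\log d)^2, \tag{7.5} \]

\[ H(\mathcal{E}_{a,\sigma}(\rho)) \ge \frac{1}{c}\, g\bigl( a\, g^{-1}(cH(\rho)) + (1 - a)\, g^{-1}(cH(\sigma)) \bigr) \quad \text{with } c = 1/(d - 1). \tag{7.6} \]

Since the above bounds are of the form $H(\mathcal{E}_{a,\sigma}(\rho)) \ge G(H(\rho))$, for some function $G$, we have

\[ H_{\min}(\mathcal{E}_{a,\sigma}) \ge \min_{\rho} G(H(\rho)) = \min_{0 \le H_0 \le \log d} G(H_0). \tag{7.7} \]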

In Fig. 5 we have plotted the bounds $G(H_0)$ for two illustrative cases, the three curves corresponding to the three choices of the function $G$ as given by the right-hand sides of eqs. (7.4) to (7.6). For the qubit ($d = 2$) case we actually have a tight lower bound

\[ H(\mathcal{E}_{a,\sigma}(\rho)) \ge \ell\bigl( a\, \ell^{-1}(H(\rho)) + (1 - a)\, \ell^{-1}(H(\sigma)) \bigr), \tag{7.8} \]

where $\ell(r)$ is the entropy of a qubit state whose Bloch vector has length $r$, see eq. (A.5) in Appendix A. This bound follows from eq. (A.14) in Appendix A and is also shown in Fig. 5.

Figure 5: Plots of bounds $G$ from eq. (7.7) for the channel $\mathcal{E}_{a,\sigma}$ where $\sigma$ is the maximally mixed state $\sigma = I/d$ in dimensions $d = 2$ (left panel) and $d = 4$ (right panel). The blue curves represent the bound (7.4) obtained from eq. (3.6), the orange curves represent the bound (7.5) obtained from the entropy power inequality (3.7), and the green curves represent the bound (7.6) obtained from the entropy photon number inequality (3.8). For $d = 2$, the optimal bound (7.8) is given by the pink curve in the left panel. While none of the bounds in eqs. (7.4) to (7.6) is optimal for this channel, the numerics suggest that the entropy photon number inequality is the best out of the three when $d \ge 4$. For $d = 2$, however, the entropy power inequality (7.5) yields a better bound.

These bounds imply lower bounds on the minimum output entropy $H_{\min}(\mathcal{E}_{a,\sigma})$, which in turn allow us to obtain upper bounds on the product-state classical capacity of $\mathcal{E}_{a,\sigma}$. The latter is the capacity evaluated in the limit of asymptotically many independent uses of the channel, under the constraint that the inputs to multiple uses of the channel are necessarily product states. The Holevo-Schumacher-Westmoreland (HSW) [Hol98, SW97] theorem establishes that the product-state capacity of a memoryless quantum channel $\mathcal{E}$ is given by its Holevo capacity $\chi(\mathcal{E})$:

\[ \chi(\mathcal{E}) := \max_{\{p_i, \rho_i\}} \Bigl\{ H\Bigl( \sum_i p_i\, \mathcal{E}(\rho_i) \Bigr) - \sum_i p_i\, H(\mathcal{E}(\rho_i)) \Bigr\}, \tag{7.9} \]

where the maximum is taken over all ensembles $\{p_i, \rho_i\}$ of possible input states $\rho_i$ occurring with probabilities $p_i$. Using the above expression, and the fact that $H(\omega) \le \log d$ for any $\omega \in \mathcal{D}(\mathbb{C}^d)$, we obtain the following simple bound:

\[ \chi(\mathcal{E}) \le \log d - \min_{\rho} H(\mathcal{E}(\rho)), \tag{7.10} \]

where the minimum is taken over all possible inputs to the channel. Applying this bound to the channel $\mathcal{E}_{a,\sigma}$ for any $a \in [0,1]$ and $\sigma \in \mathcal{D}(\mathbb{C}^d)$ and using eq. (7.4) we infer that

\[
\begin{aligned}
\chi(\mathcal{E}_{a,\sigma}) &\le \log d - a \min_{\rho} H(\rho) - (1 - a) H(\sigma) \\
&= \log d - (1 - a) H(\sigma).
\end{aligned}
\tag{7.11}
\]

For the case of the qubit channel introduced above, we thus obtain the bound

\[ \chi(\mathcal{E}_{a,\sigma}) \le \log 2 - (1 - a)\, h(\delta), \tag{7.12} \]

where $h(\delta) := -\delta \log \delta - (1 - \delta) \log(1 - \delta)$ is the binary entropy. Even sharper bounds are possible by exploiting eqs. (7.5) and (7.6).
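To make the channel $\mathcal{E}_{a,\sigma}$ concrete, the following sketch implements the partial swap with a fixed second input, compares it with the closed-form qubit output of eq. (7.2), and evaluates the Holevo-capacity bound (7.11). All function names are illustrative assumptions, and the random input state is generated from a Gaussian matrix purely for the sake of the example.

```python
import numpy as np


def partial_swap(rho, sigma, a):
    """E_{a,sigma}(rho) = a rho + (1-a) sigma - sqrt(a(1-a)) i [rho, sigma], eq. (7.1)."""
    comm = rho @ sigma - sigma @ rho
    return a * rho + (1 - a) * sigma - 1j * np.sqrt(a * (1 - a)) * comm


def qubit_output(rho, a, delta):
    """Closed-form output of eq. (7.2) for sigma = diag(delta, 1 - delta)."""
    s = np.sqrt(a * (1 - a)) * (1 - 2 * delta)
    return np.array([
        [a * rho[0, 0] + (1 - a) * delta, rho[0, 1] * (a - 1j * s)],
        [rho[1, 0] * (a + 1j * s), a * rho[1, 1] + (1 - a) * (1 - delta)],
    ])


def entropy(m):
    """Von Neumann entropy in nats."""
    lam = np.linalg.eigvalsh(m)
    lam = lam[lam > 1e-15]
    return float(-np.sum(lam * np.log(lam)))


def holevo_upper_bound(a, sigma):
    """Right-hand side of eq. (7.11): log d - (1 - a) H(sigma)."""
    d = sigma.shape[0]
    return float(np.log(d) - (1 - a) * entropy(sigma))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = g @ g.conj().T
    rho /= np.trace(rho).real
    a, delta = 0.3, 0.2
    sigma = np.diag([delta, 1 - delta]).astype(complex)
    print(np.allclose(partial_swap(rho, sigma, a), qubit_output(rho, a, delta)))
    print(holevo_upper_bound(a, sigma))
```

For $\sigma = I/d$ the commutator vanishes and `partial_swap` reduces to $a\rho + (1-a)I/d$, the depolarizing channel mentioned above; in that case the bound (7.11) evaluates to $a \log d$.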
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8 Summary and open questions 

In this paper we establish a class of entropy power inequalities (EPIs) for $d$-level quantum systems, 
or qudits. The underlying addition rule for which these inequalities hold is given by a quantum 
channel acting on the product state $\rho \otimes \sigma$ of two qudits and yielding the state of a single qudit as 
output. We refer to this channel as a partial swap channel since its output interpolates between the 
states $\rho$ and $\sigma$ as the parameter $a$ on which it depends is changed from 1 to 0. We establish EPIs not 
only for the von Neumann entropy and the entropy power, but also for a large class of functions, 
which includes the Rényi entropies and the subentropy. Moreover, for the subclass of partial swap 
channels for which one of the qudit states in the input is fixed, our EPI for the von Neumann 
entropy yields an upper bound on the Holevo capacity. 

We would like to emphasize that the method that we employ to prove our EPIs is novel, in the 
sense that it does not mimic the proofs of the EPIs in the continuous-variable classical and quantum 
settings. Instead it relies solely on spectral majorization and concavity of certain functions. 

8.1 Open questions 

Our results lead to many interesting open questions; here we briefly mention some of them. For 
example, can a conditional version of the EPI (see [Koe15]) be proved for qudits? Can an optimal 
bound similar to eq. (7.8) be found also for $d > 2$? Is it possible to generalize our quantum addition 
rule (4.17) for combining more than two states? Such a generalization has recently been obtained 
for three states [Ozo15], though the problem for four or more states is not yet fully resolved. More 
importantly, proving analogues of our EPI for three or more states (similar to the multi-input EPI 
of [DMLG15]) remains an interesting open question. Finally, is the partial swap channel that we 
define the unique channel resulting in an interpolation between the input states and yielding a 
non-trivial EPI (i.e., one that is not simply a statement of concavity)? According to [Ozo15], it is 
unique (up to the sign of $i$) in a certain class of channels. 

In Section 7, we mentioned a simple application of our EPI to quantum Shannon theory. 
Considering the significance of the classical EPI in information theory and statistics, we expect that 
our EPIs will also find further applications. 

Finally, it would be worth exploring whether our proof of the qudit analogue of the entropy 
photon number inequality can be generalized to establish the EPnI for the bosonic case (which is 
known to be an important open problem). 
A Entropy power inequality for qubits 

For the case of qubits (d = 2), there is a simple proof of eq. (3.6) which exploits the Bloch-vector 
representation of a qubit state. 

A.1 Qubit states and the Bloch sphere 

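It is known that the state $\rho$ of a qubit can be expressed in terms of its Bloch vector $\vec{r}$ as follows: 

$$ \rho = \frac{1}{2}\big(I + \vec{r} \cdot \vec{\sigma}\big) = \frac{1}{2}\big(I + x\sigma_x + y\sigma_y + z\sigma_z\big), \tag{A.1} $$

where $\vec{r} := (x, y, z) \in \mathbb{R}^3$ such that $|\vec{r}| := \sqrt{x^2 + y^2 + z^2} \le 1$. Here $\vec{r} \cdot \vec{\sigma}$ denotes a formal inner 
product between $\vec{r}$ and $\vec{\sigma} := (\sigma_x, \sigma_y, \sigma_z)$, with $\sigma_x$, $\sigma_y$ and $\sigma_z$ being the Pauli matrices. Moreover, the 
eigenvalues of the state $\rho$ can easily be seen to be given by $\frac{1}{2}(1 \pm |\vec{r}|)$. Hence, its von Neumann 
entropy is simply 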

$$ H(\rho) = h\big(\tfrac{1}{2}(1 + |\vec{r}|)\big), \tag{A.2} $$

where $h(p) := -p \log p - (1 - p) \log(1 - p)$ is the binary entropy of $p \in [0,1]$ in nats. For $x \in [-1,1]$, 
let us define the function 

$$ \ell(x) := h\big(\tfrac{1}{2}(1 + x)\big). \tag{A.3} $$

One can easily see that $\ell$ is symmetric around the vertical axis and verify that 

$$ \ell''(x) = -\frac{1}{1 - x^2} < 0, \tag{A.4} $$

so $\ell$ is concave (see Fig. 6). In terms of this function, eq. (A.2) is given by 

$$ H(\rho) = \ell(|\vec{r}|). \tag{A.5} $$
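
The relations (A.1) to (A.5) are easy to check numerically. The sketch below, which uses only the Python standard library, builds the $2 \times 2$ matrix of eq. (A.1) from a Bloch vector, obtains its eigenvalues from the trace and determinant, and compares the resulting entropy with $\ell(|\vec{r}|)$. All function names are hypothetical and chosen for illustration.

```python
import math


def bloch_to_state(x, y, z):
    """Density matrix (I + x*sx + y*sy + z*sz)/2 of eq. (A.1) as nested tuples."""
    return ((0.5 * (1 + z), 0.5 * (x - 1j * y)),
            (0.5 * (x + 1j * y), 0.5 * (1 - z)))


def eigenvalues_2x2(rho):
    """Eigenvalues of a Hermitian 2x2 matrix from its trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return (0.5 * (tr + disc), 0.5 * (tr - disc))


def entropy(spectrum):
    """von Neumann entropy in nats, with the convention 0 log 0 = 0."""
    return -sum(v * math.log(v) for v in spectrum if v > 0)


def ell(x):
    """The function of eq. (A.3): binary entropy of (1 + x)/2, in nats."""
    p = 0.5 * (1.0 + x)
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


if __name__ == "__main__":
    x, y, z = 0.3, -0.4, 0.5
    r = math.sqrt(x * x + y * y + z * z)
    eigs = eigenvalues_2x2(bloch_to_state(x, y, z))
    # The two printed numbers should agree up to rounding, as eq. (A.5) states.
    print(entropy(eigs), ell(r))
```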

A.2 Proof of the qubit EPI 

For a pair of qubit states $\rho_1$ and $\rho_2$, the first EPI of Corollary 7 is given by 

$$ H(\rho_1 \boxplus_a \rho_2) \ge a H(\rho_1) + (1 - a) H(\rho_2), \qquad \forall a \in [0,1]. \tag{A.6} $$

Below is a simple proof of the above inequality for the special case of qubits. 

Proof. Using eq. (A.5), the inequality (A.6) can be expressed in terms of the function $\ell$ as follows: 

$$ \ell(r) \ge a \ell(r_1) + (1 - a) \ell(r_2), \tag{A.7} $$
Figure 6: A plot of the function $\ell$ defined in eq. (A.3). 

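where $r := |\vec{r}|$, $r_1 := |\vec{r}_1|$, $r_2 := |\vec{r}_2|$, and $\vec{r}$, $\vec{r}_1$, $\vec{r}_2$ denote the Bloch vectors of the states $\rho_1 \boxplus_a \rho_2$, $\rho_1$ 
and $\rho_2$, respectively. Recall from eq. (4.18) that $\vec{r}$ can be expressed in terms of $\vec{r}_1$ and $\vec{r}_2$ as follows: 

$$ \vec{r} = a \vec{r}_1 + (1 - a) \vec{r}_2 + \sqrt{a(1 - a)}\, (\vec{r}_1 \times \vec{r}_2). \tag{A.8} $$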

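Since $\vec{r}_1$ and $\vec{r}_2$ are both perpendicular to $\vec{r}_1 \times \vec{r}_2$, we get 

$$ |\vec{r}|^2 = \vec{r} \cdot \vec{r} = a^2 |\vec{r}_1|^2 + (1 - a)^2 |\vec{r}_2|^2 + 2a(1 - a)\, \vec{r}_1 \cdot \vec{r}_2 + a(1 - a) |\vec{r}_1 \times \vec{r}_2|^2. \tag{A.9} $$

If we denote by $\gamma \in [0, \pi]$ the angle between vectors $\vec{r}_1$ and $\vec{r}_2$, then $\vec{r}_1 \cdot \vec{r}_2 = |\vec{r}_1| |\vec{r}_2| \cos\gamma$ and 
$|\vec{r}_1 \times \vec{r}_2| = |\vec{r}_1| |\vec{r}_2| \sin\gamma$, so eq. (A.9) becomes 

$$ r^2 = a^2 r_1^2 + (1 - a)^2 r_2^2 + a(1 - a) \big( 2 r_1 r_2 \cos\gamma + r_1^2 r_2^2 \sin^2\gamma \big). \tag{A.10} $$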

Note that the right-hand side of the inequality (A.7) does not depend on the angle $\gamma$ between 
the vectors $\vec{r}_1$ and $\vec{r}_2$, so it suffices to prove eq. (A.7) only for those values of $\gamma$ that minimize the 
left-hand side. Since $\ell(r)$ is a decreasing function of $r$ for $r \ge 0$ (see Fig. 6), we have to consider 
only those values of $\gamma$ that maximize $r$. From eq. (A.10) we have that 

$$ r = \sqrt{a^2 r_1^2 + (1 - a)^2 r_2^2 + a(1 - a)\, r_1 r_2 \big( 2 \cos\gamma + r_1 r_2 \sin^2\gamma \big)}, \tag{A.11} $$

where $a, r_1, r_2 \in [0,1]$. To maximize this over $\gamma$, we only need to maximize the last term. Note that 

$$ 2 \cos\gamma + r_1 r_2 \sin^2\gamma \le 2 \cos\gamma + \sin^2\gamma \le 2, \tag{A.12} $$

where the last inequality is tight if and only if $\gamma = 0$. This gives a simple upper bound on $r$: 

$$ r \le \sqrt{a^2 r_1^2 + (1 - a)^2 r_2^2 + 2a(1 - a)\, r_1 r_2} = a r_1 + (1 - a) r_2. \tag{A.13} $$
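
Both the last inequality in (A.12) and the equality in (A.13) come down to completing a square. The short computation below is added for convenience and uses only the quantities defined above.

```latex
\begin{align*}
2\cos\gamma + \sin^2\gamma
  &= 2\cos\gamma + 1 - \cos^2\gamma
   = 2 - (1 - \cos\gamma)^2 \le 2, \\
a^2 r_1^2 + (1-a)^2 r_2^2 + 2a(1-a)\, r_1 r_2
  &= \bigl(a r_1 + (1-a) r_2\bigr)^2 .
\end{align*}
```

Equality in the first line requires $\cos\gamma = 1$, that is, $\gamma = 0$ on $[0, \pi]$. In the second line the square root can be taken without a sign ambiguity because $a r_1 + (1 - a) r_2 \ge 0$.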

Since $\ell(r)$ is monotonically decreasing for $r \ge 0$, we get 

$$ \ell(r) \ge \ell\big(a r_1 + (1 - a) r_2\big). \tag{A.14} $$

Note that this lower bound is independent of the parameter $\gamma$ and is tight (it becomes equality 
when $\gamma = 0$). Recall from eq. (A.4) that $\ell$ is concave (see also Fig. 6), so 

$$ \ell\big(a r_1 + (1 - a) r_2\big) \ge a \ell(r_1) + (1 - a) \ell(r_2). \tag{A.15} $$

By combining the last two inequalities, we get the desired result. □ 
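
As a sanity check that complements the proof (it does not replace it), the qubit EPI can be tested on random inputs. The sketch below samples Bloch vectors uniformly from the unit ball, combines them according to eq. (A.8), and verifies both the norm bound (A.13) and the inequality (A.7). The helper names, the sampling scheme and the tolerance are illustrative assumptions.

```python
import math
import random


def ell(x):
    """Binary entropy of (1 + x)/2 in nats, as in eq. (A.3)."""
    p = 0.5 * (1.0 + x)
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in u))


def combine(a, r1, r2):
    """Bloch vector of the combined state, following eq. (A.8)."""
    s = math.sqrt(a * (1.0 - a))
    c = cross(r1, r2)
    return tuple(a * u + (1.0 - a) * v + s * w for u, v, w in zip(r1, r2, c))


def random_bloch_vector(rng):
    """Rejection-sample a point uniformly from the unit ball."""
    while True:
        v = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
        if norm(v) <= 1.0:
            return v


def check_qubit_epi(trials=10000, seed=0, tol=1e-12):
    """Return True if (A.13) and (A.7) hold on all sampled inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.random()
        r1, r2 = random_bloch_vector(rng), random_bloch_vector(rng)
        n1, n2 = norm(r1), norm(r2)
        r = norm(combine(a, r1, r2))
        if r > a * n1 + (1.0 - a) * n2 + tol:                  # eq. (A.13)
            return False
        if ell(r) < a * ell(n1) + (1.0 - a) * ell(n2) - tol:   # eq. (A.7)
            return False
    return True


if __name__ == "__main__":
    print(check_qubit_epi())
```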

